Repeated values when trying to insert a row using .insert()
I have a txt file I'm trying to make lists from. The txt file has 3 tab-separated columns (index, word, tag).
Example of the txt file:
    1    i       PRP
    2    want    VBP
    3    to      TO
    4    go      VB
What I'm trying to do is add a beginning-of-sentence and an end-of-sentence marker (<s> for the beginning and </s> for the end).
My code is as follows:
    trainfile = open("/Users/Desktop/training.txt").read().split('\n')
    from collections import Counter, defaultdict
    trainlines = []
    for line in trainfile:
        trainlines.append(line)
    indexlist = []
    wordlist = []
    taglist = []
    word_tag_counts = defaultdict(Counter)
    for line in trainlines:
        if not line.strip():
            continue
        index, word, tags = line.split()
        word_tag_counts[word.lower()][tags] += 1
        indexlist.append(index)
        if index == "1":
            indexlist.insert(0, '0')
            wordlist.insert(0, '<s>')
            taglist.insert(0, '<s>')
        else:
            indexlist.append(index)
            wordlist.append(word)
            taglist.append(tags)
        if word == '.':
            taglist.append('</s>')
            wordlist.append('</s>')
        else:
            continue
The issue I'm having is that my results for indexlist are:
    ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',...]
and wordlist and taglist have the same problem:
    ['<s>', '<s>', '<s>', '<s>', '<s>', '<s>', '<s>', '<s>', '<s>',...]
Why is the inserted value the only thing that's showing up in my entire list?
What I want for a final result is:
    indexlist: [0, 1, 2, 3, 4, 5, 6, 0, 1, 2...]
    wordlist: [<s>, i, want, to, go, home, </s>, <s>, ...]
Tags: python, python-3.x
Comments:
for line in trainfile: trainlines.append(line) is the same as trainlines = trainfile[:]
– Barmar, Nov 9 at 23:20
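To illustrate the equivalence mentioned in this comment, here is a minimal sketch. The sample lines are hypothetical stand-ins for the file contents, and the name trainlines_loop is introduced only for the comparison. It also shows that the slice produces a separate list, so editing the copy should not affect the original.

```python
# Hypothetical sample data standing in for the lines read from the file.
trainfile = ["1\ti\tPRP", "2\twant\tVBP", "3\tto\tTO", "4\tgo\tVB"]

# Loop-and-append version, as written in the question.
trainlines_loop = []
for line in trainfile:
    trainlines_loop.append(line)

# Slice copy, as suggested in the comment.
trainlines = trainfile[:]

print(trainlines == trainlines_loop)  # True: both hold the same lines
print(trainlines is trainfile)        # False: the slice is a new list object

# Replacing an element in the copy leaves the original list unchanged.
trainlines[0] = "0\t<s>\t<s>"
print(trainfile[0] == "1\ti\tPRP")    # True
```

Note that this is a shallow copy: it is enough here because the elements are immutable strings.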
Oh sweet, thanks! That's a much better way. Appreciate it.
– ChicJaab, Nov 9 at 23:21
I tried your script. I get:
    indexlist = ['0', '1', '2', '2', '3', '3', '4', '4']
    wordlist = ['<s>', 'want', 'to', 'go']
    taglist = ['<s>', 'VBP', 'TO', 'VB']
– Barmar, Nov 9 at 23:24
It's also not clear why you need both variables trainfile and trainlines.
– Barmar, Nov 9 at 23:25
I'm doing a lot more in this code than what I'm showing above. The train file is where I get all of my data from, but later I edit the lines and change what's in them. I don't want to edit the actual information in the file or have to keep going back to the file.
– ChicJaab, Nov 9 at 23:29
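One possible way to get the layout the question asks for is sketched below. It is not an answer from the thread. Two things in the question's loop stand out: list.insert(0, x) always places x at the very front of the list rather than at the start of the current sentence, and the index is appended twice for every token after the first. The sketch uses append throughout and adds each token exactly once. It assumes that an index of "1" always starts a new sentence and that </s> takes the next index after the last token. The function name add_sentence_markers and the sample input are hypothetical, and indexes are kept as strings, as in the question's code.

```python
from collections import Counter, defaultdict


def add_sentence_markers(lines):
    """Build index/word/tag lists with <s> and </s> around each sentence."""
    indexlist, wordlist, taglist = [], [], []
    word_tag_counts = defaultdict(Counter)
    last_index = None  # index of the previous token; None before the first one

    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        index, word, tag = line.split()
        word_tag_counts[word.lower()][tag] += 1

        if index == "1":
            # Assumption: an index of "1" always starts a new sentence.
            if last_index is not None:
                # Close the previous sentence before opening the next one.
                indexlist.append(str(int(last_index) + 1))
                wordlist.append("</s>")
                taglist.append("</s>")
            indexlist.append("0")
            wordlist.append("<s>")
            taglist.append("<s>")

        # Append each token exactly once, after any marker.
        indexlist.append(index)
        wordlist.append(word)
        taglist.append(tag)
        last_index = index

    if last_index is not None:
        # Close the final sentence.
        indexlist.append(str(int(last_index) + 1))
        wordlist.append("</s>")
        taglist.append("</s>")

    return indexlist, wordlist, taglist, word_tag_counts


# Hypothetical sample input: two short sentences separated by a blank line.
sample = [
    "1\ti\tPRP", "2\twant\tVBP", "3\tto\tTO", "4\tgo\tVB",
    "",
    "1\tgo\tVB", "2\thome\tNN",
]
indexlist, wordlist, taglist, counts = add_sentence_markers(sample)
print(indexlist)  # ['0', '1', '2', '3', '4', '5', '0', '1', '2', '3']
print(wordlist)   # ['<s>', 'i', 'want', 'to', 'go', '</s>', '<s>', 'go', 'home', '</s>']
```

Detecting sentence boundaries from the index rather than from a '.' token means the markers are still added when a sentence has no final period. If the data always ends sentences with '.', the check from the question could be used instead.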
edited Nov 10 at 0:41 by martineau
asked Nov 9 at 23:11 by ChicJaab