How to remove the â\xa0 from a list of strings in Python
I tried using replace in Python, but it didn't work.
my_list=[['the',
  'production',
  'business',
  'environmentâ\xa0evaluating',
  'the'],
 ['impact',
  'of',
  'the',
  'environmental',
  'influences',
  'such'],
 ['as',
  'political',
  'economic',
  'technological',
  'sociodemographicâ\xa0']]
my_list.replace(u'\xa0', ' ')
and
my_list[0].replace(u'\xa0', ' ')
Both attempts raise an attribute error: AttributeError: 'list' object has no attribute 'replace'
How can I remove this unwanted string from the list my_list?
python-3.x
asked Nov 9 at 8:52
9113303
Your title says something different from the body of your question. Do you want just \xa0 removed, or either â or \xa0, or exactly the text 'â\xa0'?
– usr2564301
Nov 9 at 9:30
3 Answers
accepted
Use the unicodedata library. That way you keep more information from each word.
import unicodedata
final_list = [[unicodedata.normalize("NFKD", word) for word in ls] for ls in my_list]
This turns \xa0 into a regular space and decomposes â into a plus a combining circumflex. To also replace â with a, drop the non-ASCII characters:
very_final_list = [[word.encode('ascii', 'ignore') for word in ls] for ls in final_list]
If you want to completely remove â instead, run the replace on the original strings, because after NFKD the precomposed 'â' no longer occurs in them:
very_final_list = [[unicodedata.normalize("NFKD", word.replace('â', '')) for word in ls] for ls in my_list]
To remove the b' in front of every string produced by encode, decode it back to utf-8.
So putting everything together,
import unicodedata
final_list = [[unicodedata.normalize("NFKD", word) for word in ls] for ls in my_list]
very_final_list = [[word.encode('ascii', 'ignore').decode('utf-8') for word in ls] for ls in final_list]
# very_final_list = [[unicodedata.normalize("NFKD", word.replace('â', '')) for word in ls] for ls in my_list]
And here is the final result:
[['the', 'production', 'business', 'environmenta evaluating', 'the'], ['impact', 'of', 'the', 'environmental', 'influences', 'such'], ['as', 'political', 'economic', 'technological', 'sociodemographica ']]
If you switch the very_final_list statements, then this is the output:
[['the', 'production', 'business', 'environment evaluating', 'the'], ['impact', 'of', 'the', 'environmental', 'influences', 'such'], ['as', 'political', 'economic', 'technological', 'sociodemographic ']]
edited Nov 9 at 9:43
answered Nov 9 at 9:09
Vineeth Sai
The unicodedata package works for me. After removing null spaces and special symbols, some words still appear like 'environmentâ evaluating', 'sociodemographicâ'. Why?
– 9113303
Nov 9 at 9:25
@9113303 Here, I've fixed almost every issue it had previously. Let me know if it worked.
– Vineeth Sai
Nov 9 at 9:31
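On the "Why?" in the comment above, here is a minimal sketch, assuming the strings hold the precomposed character â (U+00E2) followed by a no-break space (U+00A0). NFKD maps the no-break space to an ordinary space, but it only decomposes â into a plus a combining circumflex (U+0302), which still displays as â. The sample word comes from the question; the variable names are illustrative.

```python
import unicodedata

# Sample word from the question, written with escapes so the code points are explicit.
word = 'environment\u00e2\u00a0evaluating'

# NFKD: U+00A0 becomes a plain space, U+00E2 becomes 'a' + U+0302.
decomposed = unicodedata.normalize("NFKD", word)
print([hex(ord(ch)) for ch in decomposed])

# Dropping combining marks leaves only the base letter 'a'.
stripped = ''.join(ch for ch in decomposed if not unicodedata.combining(ch))
print(stripped)
```

This is why the encode('ascii', 'ignore') step in the answer ends up with a plain a: only the combining mark is non-ASCII after decomposition.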
lst = []
for l in my_list:
    lst.append([s.replace(u'\xa0', '') for s in l])
Output:
[['the', 'production', 'business', 'environmentâevaluating', 'the'],
 ['impact', 'of', 'the', 'environmental', 'influences', 'such'],
 ['as', 'political', 'economic', 'technological', 'sociodemographicâ']]
Hmm, I think the other answer breaks the structure of my_list. But it is easy too: only one line.
edited Nov 9 at 9:29
answered Nov 9 at 9:09
M cache
OK, my fault. I will change it.
– M cache
Nov 9 at 9:27
Better, thanks :) Your answer is what OP should be looking for, except that there is some ambiguity in the question. Leaving this as a reminder to check back later.
– usr2564301
Nov 9 at 9:31
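The loop above strips only \xa0 and leaves â in place. If both characters should be handled in one pass, str.translate is one option. This is a rough sketch rather than part of the answer: the helper name clean_nested, and the choice to turn \xa0 into a space while deleting â, are assumptions made for illustration.

```python
# Translation table: map the no-break space to a plain space, delete 'â' (None means delete).
TABLE = str.maketrans({'\u00a0': ' ', '\u00e2': None})


def clean_nested(rows):
    """Return a new list of lists with each string passed through TABLE."""
    return [[s.translate(TABLE) for s in row] for row in rows]


# Shortened version of the list from the question.
my_list = [['business', 'environment\u00e2\u00a0evaluating'],
           ['sociodemographic\u00e2\u00a0']]
cleaned = clean_nested(my_list)
print(cleaned)
```

The original my_list is left untouched, and the two-level structure is preserved.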
Updated:
A nested list comprehension should work for you:
[[w.replace("â\xa0", " ") for w in words] for words in my_list]
Output
[['the', 'production', 'business', 'environment evaluating', 'the'],
 ['impact', 'of', 'the', 'environmental', 'influences', 'such'],
 ['as', 'political', 'economic', 'technological', 'sociodemographic ']]
edited Nov 9 at 13:15
answered Nov 9 at 9:32
Ashok KS
I do not think he wants triple nested lists
– BlueSheepToken
Nov 9 at 9:33
Updated the answer. Added the triple by mistake.
– Ashok KS
Nov 9 at 9:35
@AshokKS Your output for some reason still has the special characters. Guess it was a typo?
– Vineeth Sai
Nov 9 at 9:51
@VineethSai Corrected it now. Sorry. Please upvote if you found it useful.
– Ashok KS
Nov 9 at 13:15
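The comprehension above assumes exactly two levels of nesting. The AttributeError in the question comes from the same root: replace is a method of str, not of list, so it has to reach the individual strings. As a sketch for lists of unknown depth, a small recursive helper could look like this; replace_deep is a hypothetical name, not something from the answers.

```python
def replace_deep(item, old, new):
    """Apply str.replace to every string inside arbitrarily nested lists."""
    if isinstance(item, str):
        return item.replace(old, new)
    if isinstance(item, list):
        return [replace_deep(sub, old, new) for sub in item]
    return item  # leave anything that is neither str nor list untouched


# Shortened version of the list from the question.
my_list = [['the', 'environment\u00e2\u00a0evaluating'],
           ['sociodemographic\u00e2\u00a0']]
result = replace_deep(my_list, '\u00e2\u00a0', ' ')
print(result)
```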