Pandas: Conditionally insert rows into DataFrame while iterating through rows

While iterating through the rows of a specific column in a Pandas DataFrame, I would like to add a new row below the currently iterated row, if the cell in the currently iterated row meets a certain condition.

Say for example:

df = pd.DataFrame(data = {'A': [0.15, 0.15, 0.7], 'B': [1500, 1500, 7000]})

DataFrame:

      A     B
0  0.15  1500
1  0.15  1500
2  0.70  7000

Attempt:

y = 100                             #An example scalar

i = 1

for x in df['A']:
    if x is not None:               #Values in 'A' are filled atm, but not necessarily.
        df.loc[i] = [None, x*y]     #Should insert None into 'A', and product into 'B'.
        df.index = df.index + 1     #Shift index? According to this S/O answer: https://stackoverflow.com/a/24284680/4909923
    i = i + 1

df.sort_index(inplace=True)         #Sort index?

I haven't succeeded so far: the index numbering is shifted and no longer starts at 0, and the rows don't seem to be inserted in order:

      A     B
3  0.15  1500
4   NaN    70
5  0.70  7000
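A small sketch of one likely contributor to the result above: assigning through `df.loc[label]` overwrites the row when the label already exists and only adds a row when it does not. The variable names `rows_after_overwrite` and `rows_after_append` are just for illustration and are not part of the original attempt.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [0.15, 0.15, 0.7], 'B': [1500, 1500, 7000]})

# Label 1 already exists, so this overwrites row 1 instead of inserting a row.
df.loc[1] = [np.nan, 15]
rows_after_overwrite = len(df)

# Label 3 does not exist yet, so this adds a new row at the end.
df.loc[3] = [np.nan, 70]
rows_after_append = len(df)

print(df)
```

So with `i` starting at 1, the loop in the attempt replaces existing rows rather than inserting below them, and adding 1 to the whole index on every pass would explain why the labels no longer start at 0.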

I also tried several variants of this, including applymap with a lambda function, but could not get any of them to work.

Desired result:

      A     B
0  0.15  1500
1  None  15
2  0.15  1500
3  None  15
4  0.70  7000
5  None  70

asked Nov 10 at 10:37 by Winterflags

As it stands, I see no use case for pandas. You're iterating (mostly a no no) and you want to insert rows (also not really a good thing in numpy, which is the underlying structure)
– roganjosh
Nov 10 at 10:39

@roganjosh I am already using Pandas, this is just a subset of a DataFrame script doing other things also. I need to be able to insert rows as the program will create a DataFrame depending on other factors, therefore not so desirable to preallocate the index (also while inserting rows is inefficient, it doesn't matter as I'm dealing with less than tens of rows, not thousands).
– Winterflags
Nov 10 at 10:43

So my point still stands. Drop pandas and deal with nested lists. Pandas is just getting in the way here.
– roganjosh
Nov 10 at 10:44
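For what it's worth, the nested-list idea from the comment above could look roughly like this. It is only a sketch: the `rows` and `out` names are made up here, and `round()` is used on the product as an assumption so that 'B' holds whole numbers like in the desired result.

```python
import pandas as pd

rows = [[0.15, 1500], [0.15, 1500], [0.7, 7000]]   # same data as in the question
y = 100                                            # example scalar from the question

out = []
for a, b in rows:
    out.append([a, b])
    if a is not None:
        # Extra row: nothing in 'A', product in 'B'.
        out.append([None, round(a * y)])

# Build the DataFrame once, after all rows are known.
df = pd.DataFrame(out, columns=['A', 'B'])
print(df)
```

Since the frame is built in one go at the end, there is no index to shift or sort.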

python pandas


2 Answers
accepted

I believe you can use:

import pandas as pd

df = pd.DataFrame(data = {'A': [0.15, 0.15, 0.7],
                          'B': [1500, 1500, 7000],
                          'C': [100, 200, 400]})

v = 100
L = []   # collects one dictionary per output row

for i, x in df.to_dict('index').items():
    print (x)
    # append the original row as a dictionary
    L.append(x)
    # append a new dictionary; for the missing keys ('B', 'C') the DataFrame constructor adds NaNs
    L.append({'A':x['A'] * v})

df = pd.DataFrame(L)
print (df)

       A       B      C
0   0.15  1500.0  100.0
1  15.00     NaN    NaN
2   0.15  1500.0  200.0
3  15.00     NaN    NaN
4   0.70  7000.0  400.0
5  70.00     NaN    NaN

answered Nov 10 at 10:59 by jezrael (edited Nov 10 at 12:00)

@Winterflags - It may be easier to loop over dictionaries instead; see the edited answer.
– jezrael
Nov 10 at 12:00
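A variation on the dictionary loop that aims at the desired result from the question (empty 'A', product in 'B', only when 'A' is filled) might look like the sketch below. Using `to_dict('records')`, `pd.notna` and `round()` here is my own assumption about what fits, not something taken from the answer.

```python
import pandas as pd

df = pd.DataFrame({'A': [0.15, 0.15, 0.7], 'B': [1500, 1500, 7000]})
y = 100

records = []
for row in df.to_dict('records'):
    records.append(row)
    # Only add the extra row when 'A' actually holds a value.
    if pd.notna(row['A']):
        records.append({'A': None, 'B': round(row['A'] * y)})

result = pd.DataFrame(records)
print(result)
```

The None entries show up as NaN once the frame is built, because 'A' is a float column.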


It doesn't seem you need a manual loop here:

import numpy as np
import pandas as pd

df = pd.DataFrame(data = {'A': [0.15, 0.15, 0.7], 'B': [1500, 1500, 7000]})

y = 100

# copy slice of dataframe
df_extra = df.loc[df['A'].notnull()].copy()

# assign A and B series values
df_extra = df_extra.assign(A=np.nan, B=(df_extra['A']*y).astype(int))

# increment index partially, required for sorting afterwards
df_extra.index += 0.5

# concatenate, sort index, drop index
# (DataFrame.append was removed in pandas 2.0; pd.concat does the same job here)
res = pd.concat([df, df_extra]).sort_index().reset_index(drop=True)

print(res)

      A     B
0  0.15  1500
1   NaN    15
2  0.15  1500
3   NaN    15
4  0.70  7000
5   NaN    70

answered Nov 10 at 12:22 by jpp
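If the condition is not always true, the same half-step index idea could be wrapped in a small helper. This is only a sketch: `insert_below`, `mask` and `make_rows` are hypothetical names, it assumes the frame has the default 0..n-1 integer index, and it uses `round()` before the integer cast as a guard against float noise.

```python
import numpy as np
import pandas as pd

def insert_below(df, mask, make_rows):
    """Insert one new row below each row where mask is True.

    Assumes df has a default 0..n-1 integer index.
    """
    extra = make_rows(df.loc[mask]).copy()
    # Half-step labels sort between a source row and the row after it.
    extra.index = extra.index + 0.5
    return pd.concat([df, extra]).sort_index().reset_index(drop=True)

# Row 1 has no value in 'A', so no extra row should follow it.
df = pd.DataFrame({'A': [0.15, np.nan, 0.7], 'B': [1500, 1500, 7000]})
y = 100

res = insert_below(
    df,
    df['A'].notna(),
    lambda d: d.assign(A=np.nan, B=(d['A'] * y).round().astype(int)),
)
print(res)
```

Here only rows 0 and 2 get a row inserted below them, so the result has five rows.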
