Generate numpy array using multiple columns of pandas dataframe
Sorry for the long post.
I'm using Python 3.6 on Windows 10. I have a pandas data frame that contains around 100,000 rows. From this data frame I need to generate four NumPy arrays. The first 5 relevant rows of my data frame look like this:
A B x UB1 LB1 UB2 LB2
0.2134 0.7866 0.2237 0.1567 0.0133 1.0499 0.127
0.24735 0.75265 0.0881 0.5905 0.422 1.4715 0.5185
0.0125 0.9875 0.1501 1.3721 0.5007 2.0866 2.0617
0.8365 0.1635 0.0948 1.9463 1.0854 2.4655 1.9644
0.1234 0.8766 0.0415 2.7903 2.2602 3.5192 3.2828
Column B is 1 - column A. Column B is not actually in my data frame; I have added it only to explain my problem.
From this data frame, I need to generate four arrays.
My first array, c, looks like array([-0.2134, -0.7866,-0.24735, -0.75265,-0.0125, -0.9875,-0.8365, -0.1635,-0.1234, -0.8766],dtype=float32)
where the first element is the first row of column A with a negative sign added, the second element is the negated first row of column B, the third element is the negated second row of column A, the fourth is the negated second row of column B, and so on.
My second array, UB, looks like
array([ 0.2237, 0.0881, 0.1501, 0.0948, 0.0415, 0.2237],dtype=float32)
where the elements are the rows of column x.
My third array, bounds, looks like
array([[0.0133 , 0.1567],
[0.127 , 1.0499],
[0.422 , 0.5905],
[0.5185 , 1.4715],
[0.5007 , 1.3721],
[2.0617 , 2.0866],
[1.0854 , 1.9463],
[1.9644 , 2.4655],
[2.2602 , 2.7903],
[3.2828 , 3.5192]])
where bounds[0][0] is the first row of LB1 and bounds[0][1] is the first row of UB1; bounds[1][0] is the first row of LB2 and bounds[1][1] is the first row of UB2. Then bounds[2][0] is the second row of LB1, and so on.
My fourth array looks like
array([[-1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, -1, 1, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, -1, 1, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, -1, 1, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, -1, 1]])
It has the same number of rows as the data frame and twice as many columns as the data frame has rows.
Can you please tell me an efficient way to generate these arrays for 100,000 rows?
python arrays pandas
asked Nov 8 at 10:49
Tanvi Mirza
1 Answer
This should be rather straightforward:
from io import StringIO
import pandas as pd
import numpy as np
data = """A B x UB1 LB1 UB2 LB2
0.2134 0.7866 0.2237 0.1567 0.0133 1.0499 0.127
0.24735 0.75265 0.0881 0.5905 0.422 1.4715 0.5185
0.0125 0.9875 0.1501 1.3721 0.5007 2.0866 2.0617
0.8365 0.1635 0.0948 1.9463 1.0854 2.4655 1.9644
0.1234 0.8766 0.0415 2.7903 2.2602 3.5192 3.2828"""
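# Raw string so that \s reaches the regex engine unchanged (avoids an invalid-escape warning).
df = pd.read_csv(StringIO(data), sep=r'\s+', header=0)
# Stack A and 1 - A as two columns, negate, then flatten row by row to interleave them.
c = -np.stack([df['A'], 1 - df['A']], axis=1).ravel()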
print(c)
# [-0.2134 -0.7866 -0.24735 -0.75265 -0.0125 -0.9875 -0.8365 -0.1635
# -0.1234 -0.8766 ]
ub = df['x'].values
print(ub)
# [0.2237 0.0881 0.1501 0.0948 0.0415]
# Put LB1, UB1, LB2, UB2 side by side, then split each row into two (lower, upper) pairs.
bounds = np.stack([df['LB1'], df['UB1'], df['LB2'], df['UB2']], axis=1).reshape((-1, 2))
print(bounds)
# [[0.0133 0.1567]
# [0.127 1.0499]
# [0.422 0.5905]
# [0.5185 1.4715]
# [0.5007 1.3721]
# [2.0617 2.0866]
# [1.0854 1.9463]
# [1.9644 2.4655]
# [2.2602 2.7903]
# [3.2828 3.5192]]
n = len(df)
# Row i gets -1 in column 2*i and 1 in column 2*i + 1; everything else stays 0.
fourth = np.zeros((n, 2 * n))
idx = np.arange(n)
fourth[idx, 2 * idx] = -1
fourth[idx, 2 * idx + 1] = 1
print(fourth)
# [[-1. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
# [ 0. 0. -1. 1. 0. 0. 0. 0. 0. 0.]
# [ 0. 0. 0. 0. -1. 1. 0. 0. 0. 0.]
# [ 0. 0. 0. 0. 0. 0. -1. 1. 0. 0.]
# [ 0. 0. 0. 0. 0. 0. 0. 0. -1. 1.]]
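Two caveats for the full 100,000-row frame, sketched here under assumptions that go beyond the code above. First, the question shows `float32` arrays, while the code above yields `float64`; filling a preallocated `float32` array is one way to get the requested dtype. Second, a dense `fourth` with 100,000 rows and 200,000 columns of `float64` would need roughly 160 GB, so it is probably more practical to keep only the non-zero entries as (row, column, value) triplets. The helper names `interleaved_c` and `fourth_triplets` are hypothetical, and the SciPy step is only mentioned in a comment because it assumes SciPy is installed.

```python
import numpy as np
import pandas as pd


def interleaved_c(a, dtype=np.float32):
    """Return [-a0, -(1 - a0), -a1, -(1 - a1), ...] as a 1-D array."""
    a = np.asarray(a, dtype=np.float64)
    c = np.empty(2 * a.size, dtype=dtype)
    c[0::2] = -a          # even positions: negated column A
    c[1::2] = a - 1.0     # odd positions: -(1 - A)
    return c


def fourth_triplets(n):
    """Return (rows, cols, vals) for the n x 2n matrix of (-1, 1) pairs."""
    rows = np.repeat(np.arange(n), 2)     # 0, 0, 1, 1, ...
    cols = np.arange(2 * n)               # 0, 1, 2, 3, ...
    vals = np.tile(np.array([-1, 1], dtype=np.int8), n)
    return rows, cols, vals


# Small stand-in for the real data frame (values taken from the question).
df = pd.DataFrame({"A": [0.2134, 0.24735, 0.0125, 0.8365, 0.1234]})

c = interleaved_c(df["A"].to_numpy())
rows, cols, vals = fourth_triplets(len(df))

# For a small n the triplets can be checked against the dense construction.
dense = np.zeros((len(df), 2 * len(df)), dtype=np.int8)
dense[rows, cols] = vals

# Assumption: with SciPy available, the same triplets could be passed to
# scipy.sparse.coo_matrix((vals, (rows, cols)), shape=(n, 2 * n)).
```

For n = 100,000 each of the three triplet arrays holds only 200,000 entries, which is why this form scales where the dense array does not.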
answered Nov 8 at 11:09
jdehesa
It works, thanks a lot @jdehesa
– Tanvi Mirza
Nov 8 at 11:39