How can I do filtering between two matrices?
7 votes
File1:
91 23 56 44 87 77
99 34 56 22 22 95
41 88 26 79 60 27
95 55 66 69 92 25
File2:
pass fail pass pass pass fail
pass fail pass fail fail pass
pass pass fail pass pass fail
pass pass fail pass pass fail
I want to sum the failed marks in each row. Here is the expected output.
output:
100
78
53
91
How can I filter file1 based on the word "fail" at the same position in file2, so that I get the sum of the failed marks for each row?
text-processing
edited Nov 8 at 13:05 by Braiam
asked Nov 8 at 8:23 by Owen
What is producing these two files and can't that program do this?
– Kusalananda
Nov 8 at 9:42
7 Answers
4 votes, accepted
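I don't think you need an END section:
awk '
    # First file (marks): store every field, keyed by column and row.
    NR == FNR {
        for (i = 1; i <= NF; i++) F[i, NR] = $i
        next
    }
    # Second file (pass/fail): add up the marks whose column says "fail".
    {
        T = 0
        for (i = 1; i <= NF; i++) T += ($i == "fail") ? F[i, FNR] : 0
        print T
    }
' file[12]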
100
78
53
91
edited Nov 8 at 17:14
answered Nov 8 at 14:19 by RudiC
You are right, END section is redundant, +1.
– jimmij
Nov 8 at 15:48
10 votes
I would use a matrix language for such a task, e.g. GNU Octave.
Assuming you converted the pass/fail file into numerical values, e.g.:
sed 's/pass/1/g; s/fail/0/g' passfail > passfail.nums
You can now do the following:
marks = dlmread('marks');
passfail = dlmread('passfail.nums');
for i = 1:size(marks)(1)
  sum(marks(i,:)(passfail(i,:) == 0))
end
Output:
ans = 100
ans = 78
ans = 53
ans = 91
answered Nov 8 at 12:47 by Thor
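For readers without Octave, the masking idea in the answer above (turn pass/fail into 1/0, then keep only the marks where the mask is 0) can be sketched in plain Python. This is only an illustration under assumptions: the helper names parse_marks, parse_mask and fail_sums are made up here, and the sample data is copied from the question rather than read from files.

```python
# Sample data copied from the question (File1 and File2).
MARKS = """91 23 56 44 87 77
99 34 56 22 22 95
41 88 26 79 60 27
95 55 66 69 92 25"""

FLAGS = """pass fail pass pass pass fail
pass fail pass fail fail pass
pass pass fail pass pass fail
pass pass fail pass pass fail"""


def parse_marks(text):
    """Turn whitespace-separated rows of numbers into a list of int lists."""
    return [[int(x) for x in line.split()] for line in text.splitlines()]


def parse_mask(text):
    """Map pass to 1 and fail to 0, mirroring the sed conversion step."""
    return [[1 if word == "pass" else 0 for word in line.split()]
            for line in text.splitlines()]


def fail_sums(marks, mask):
    """For each row, sum the marks whose mask entry is 0 (a fail)."""
    return [sum(m for m, keep in zip(row, flags) if keep == 0)
            for row, flags in zip(marks, mask)]


print(fail_sums(parse_marks(MARKS), parse_mask(FLAGS)))  # prints [100, 78, 53, 91]
```

The numeric mask is not strictly needed in Python, since the words could be compared directly, but keeping it makes the correspondence with the Octave steps easier to follow.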
7 votes
While I think using awk is good for portability, other languages seem easier to write and read for this task. GNU Octave was mentioned, but it does not come preinstalled on most machines. On the other hand, most systems have some version of Python preinstalled. Here is a Python version:
with open('file1') as marks_file, open('file2') as decisions_file:
    # Walk both files in step, one row of marks with one row of decisions.
    for marks, decisions in zip(marks_file, decisions_file):
        row_score = 0
        for mark, decision in zip(marks.split(), decisions.split()):
            if decision == 'fail':
                row_score += int(mark)
        print(row_score)
This prints the expected output.
answered Nov 8 at 14:42 by Maxim
6 votes
Here is my awk approach:
awk 'NR==FNR{for(i=1;i<=NF;i++) a[NR"-"i]=$i; next}
     {for(j=1;j<=NF;j++) if($j=="fail") b[FNR]+=a[FNR"-"j]; n=FNR}
     END{for(k=1;k<=n;k++) print b[k]+0}' file1 file2
Awk has no true two-dimensional arrays, so this emulates one by combining the row and field numbers in a single array index. The END block walks the rows by number because for (k in b) does not guarantee any particular order. The output is:
100
78
53
91
edited Nov 8 at 15:44
answered Nov 8 at 8:55 by jimmij
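The trick in the answer above, joining the row number and the field number into one string index, can be illustrated with a Python dictionary. This is a rough sketch rather than a translation anyone posted: the function name fail_sums_by_key is hypothetical, and the input rows are passed as lists of strings instead of being read from file1 and file2.

```python
def fail_sums_by_key(marks_lines, flag_lines):
    """Sum failed marks per row using a flat dict keyed by "row-col" strings."""
    cells = {}
    # First pass: store every mark under a composite key such as "2-5".
    for row, line in enumerate(marks_lines, start=1):
        for col, value in enumerate(line.split(), start=1):
            cells[f"{row}-{col}"] = int(value)

    totals = []
    # Second pass: look up the mark behind every "fail" flag.
    for row, line in enumerate(flag_lines, start=1):
        total = 0
        for col, flag in enumerate(line.split(), start=1):
            if flag == "fail":
                total += cells[f"{row}-{col}"]
        totals.append(total)
    return totals


marks = ["91 23 56 44 87 77", "99 34 56 22 22 95"]
flags = ["pass fail pass pass pass fail", "pass fail pass fail fail pass"]
print(fail_sums_by_key(marks, flags))  # prints [100, 78]
```

The separator in the key matters: without it, row 1 column 11 and row 11 column 1 would both map to "111". Collecting the totals in a list also keeps them in row order.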
3 votes
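awk '
    # Take file2 off the argument list and read it by hand with getline.
    BEGIN { pf = ARGV[2]; ARGV[2] = "" }
    # For each line of file1, fetch the matching line of file2.
    {
        getline l < pf; split(l, a); n = 0
        for (i = 1; i <= NF; i++) if (a[i] == "fail") n += $i
        print n
    }
' file1 file2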
100
78
53
91
Like @Maxim's Python version, and unlike the other answers, this processes the two files in parallel, line by line, instead of loading one of them entirely into memory.
edited Nov 8 at 18:38
answered Nov 8 at 18:28 by mosvy
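The line-by-line, in-parallel reading described above could look roughly like this in Python. It is a sketch under assumptions: stream_fail_sums is a made-up name, and io.StringIO objects stand in for the two open files so the example runs on its own. With real files, the two open file objects would be passed in the same way.

```python
import io


def stream_fail_sums(marks_stream, flags_stream):
    """Yield one total per row, holding only the current pair of lines in memory."""
    # zip pulls one line from each stream at a time, so neither input is loaded whole.
    for marks_line, flags_line in zip(marks_stream, flags_stream):
        yield sum(int(mark)
                  for mark, flag in zip(marks_line.split(), flags_line.split())
                  if flag == "fail")


# In-memory stand-ins for the first two rows of file1 and file2.
marks = io.StringIO("91 23 56 44 87 77\n99 34 56 22 22 95\n")
flags = io.StringIO("pass fail pass pass pass fail\npass fail pass fail fail pass\n")

for total in stream_fail_sums(marks, flags):
    print(total)  # prints 100, then 78
```

Because the function is a generator, each total is produced as soon as its row has been read, which suits inputs too large to hold in memory.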
2 votes
An Awk script makes this fairly easy to solve, for example the one below. It is probably a bit slower than jimmij's answer.
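#!/usr/bin/awk -f
# First file (file2, pass/fail): remember the "fail" column numbers of each row.
FNR == NR {
    for (i = 1; i <= NF; i++)
        if ($i == "fail")
            idxArray[FNR] = (idxArray[FNR]) ? (idxArray[FNR] " " i) : (i)
    next
}
# Second file (file1, marks): sum the marks in the remembered columns.
{
    delete Array
    delete Line
    sum = 0
    n = split(idxArray[FNR], Array, " ")
    l = split($0, Line, " ")
    for (i = 1; i <= n; i++)
        for (j = 1; j <= l; j++)
            if (Array[i] == j)
                sum += Line[j]
    print sum
}
Run the script as: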
awk -f script.awk file2 file1
edited Nov 8 at 9:31
answered Nov 8 at 9:09 by Inian
0 votes
One-liner:
paste file[12] | awk '{T=0; for (i=1; i<=NF/2; i++) T += ($(i+NF/2)=="fail")?$i:0; print T}'
100
78
53
91
answered Nov 9 at 8:47 by RudiC
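The one-liner above relies on paste gluing each row of marks to its row of flags, so every joined line has twice as many fields as either input and field i pairs with field i + NF/2. A small Python sketch of that half-and-half split follows, with the hypothetical helper name fail_sum_from_pasted and a hand-built joined line instead of real paste output.

```python
def fail_sum_from_pasted(line):
    """Split a pasted line into marks (first half) and flags (second half)."""
    fields = line.split()
    half = len(fields) // 2
    marks, flags = fields[:half], fields[half:]
    # Pair field i with field i + half, as the awk one-liner does.
    return sum(int(mark) for mark, flag in zip(marks, flags) if flag == "fail")


# What paste would produce for the first row of file1 and file2 (tab-joined).
pasted = "91 23 56 44 87 77\tpass fail pass pass pass fail"
print(fail_sum_from_pasted(pasted))  # prints 100
```

This only works when both files have the same number of fields on every row. Otherwise the midpoint no longer separates marks from flags.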