Run Template on the Cloud Dataflow service
I'm trying to run a template I developed locally on Google Cloud Dataflow.
The problem is that when I run it from Google Cloud Shell with:
python -m dataflow.py --project poc-cloud-209212 --temp_location gs://<...>
I got this error:
/usr/bin/python: No module named apache_beam
So I tried a simpler example: the wordcount.
Following Google's documentation, I execute:
python -m wordcount.py --input gs://dataflow-samples/shakespeare/kinglear.txt --output gs://<...> --runner DataflowRunner --project <project> --temp_location gs://<...>
And I got this error:
/usr/bin/python: No module named past.builtins
If I execute without .py:
python -m wordcount --input gs://dataflow-samples/shakespeare/kinglear.txt --output gs://<...> --runner DataflowRunner --project <project> --temp_location gs://<...>
Again, the same error, but with "more" information:
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/<...>/wordcount.py", line 26, in <module>
from past.builtins import unicode
ImportError: No module named past.builtins
What is happening? How can I run these templates on Google Cloud Dataflow?
Do I need to set up the environment in Google Cloud like I did locally, or is that done by default?
python google-cloud-dataflow apache-beam
edited Nov 8 at 10:32
asked Nov 8 at 10:27
IoT user
I think you should specify the setup.py file in the root directory, then pass it using --setup_file=/path/to/setup.py. The problem is that the workers don't see the imports. You can import locally within functions.
– GRS
Nov 8 at 10:35
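For reference, a minimal setup.py along the lines the comment suggests might look like the sketch below. The package name, version, and dependency list are illustrative assumptions, not from the question; the future package is what provides the past.builtins module the traceback complains about.

# Illustrative setup.py for shipping the pipeline's dependencies to the Dataflow workers.
import setuptools

setuptools.setup(
    name='my-dataflow-pipeline',  # hypothetical package name
    version='0.0.1',
    install_requires=['future'],  # 'future' provides past.builtins
    packages=setuptools.find_packages(),
)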
1 Answer
Finally I got it working.
This is how:
Install virtualenv and create a Python 2.7 environment in Google Cloud Shell (Python 3.5 is installed by default, and Dataflow cannot use Python 3):
virtualenv env --python=python2
After activating this virtualenv, you can run the pipeline inside it.
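For completeness, a sketch of the full sequence of commands: the activation step and the pip install of apache-beam[gcp] are assumptions on my part, since the original answer only shows the virtualenv line.

# Create and activate a Python 2.7 virtualenv, then install the Beam SDK
# with the GCP extras before launching the pipeline.
virtualenv env --python=python2
source env/bin/activate
pip install "apache-beam[gcp]"
python -m wordcount --input gs://dataflow-samples/shakespeare/kinglear.txt --output gs://<...> --runner DataflowRunner --project <project> --temp_location gs://<...>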
edited 2 days ago
answered Nov 8 at 11:23
IoT user