Python - Web scraping - URL Protected by Kerberos HTTP SPNEGO
Hi guys,
I have a script that tests whether a URL is accessible or not. The script uses the requests and requests_kerberos modules.
For one specific website, which gives me some details about my Hadoop cluster, it always returns the same message:
HTTP Status: 401
401
I know that I have to do some configuration, since this URL is protected by Kerberos.
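As a rough sketch of what is going on: a SPNEGO-protected endpoint normally answers an unauthenticated request with a 401 and a `WWW-Authenticate: Negotiate` header, so inspecting that header is one way to confirm the 401 is a Kerberos challenge rather than some other failure. The helper name `is_negotiate_challenge` is made up for illustration and is not part of requests or requests_kerberos.

```python
def is_negotiate_challenge(status_code, headers):
    """Return True if a response looks like an SPNEGO (Negotiate) challenge."""
    if status_code != 401:
        return False
    # Header names are case-insensitive; normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    challenge = lowered.get("www-authenticate", "")
    # A server may offer several schemes in one header, separated by commas.
    schemes = [part.strip().split(" ")[0].lower() for part in challenge.split(",")]
    return "negotiate" in schemes


print(is_negotiate_challenge(401, {"WWW-Authenticate": "Negotiate"}))        # True
print(is_negotiate_challenge(401, {"WWW-Authenticate": 'Basic realm="x"'}))  # False
```

With requests, the same check could be applied to `response.status_code` and `response.headers`.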
For example, to get access using Firefox I had to follow the steps in the Cloudera documentation:
https://www.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_sg_browser_access_kerberos_protected_url.html
How can I do the same from a Python script? Is there a module that lets me get past this authentication issue?
Thanks!
python hadoop web-scraping kerberos cloudera
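For reference, a minimal sketch of how the two modules named in the question are usually combined. It assumes a valid Kerberos ticket is already in the credential cache (for example after running `kinit`), and the function name `fetch_status` is hypothetical. `HTTPKerberosAuth` and `OPTIONAL` come from requests_kerberos. Whether this resolves the 401 on a given cluster depends on the local Kerberos setup, so treat it as a starting point rather than a confirmed fix.

```python
def fetch_status(url):
    """GET `url` with Kerberos/SPNEGO authentication and return the status code.

    Assumes a valid ticket is already in the credential cache (e.g. via kinit).
    """
    # Imported inside the function so this module still loads on machines
    # where the Kerberos bindings are not installed.
    import requests
    from requests_kerberos import HTTPKerberosAuth, OPTIONAL

    # OPTIONAL: do not fail if the server skips mutual authentication.
    auth = HTTPKerberosAuth(mutual_authentication=OPTIONAL)
    response = requests.get(url, auth=auth, timeout=30)
    return response.status_code
```

Calling `fetch_status` with the cluster URL would then return 200 when the negotiation succeeds and 401 when no usable ticket is available.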
Google for python SPNego and have fun...
– Samson Scharfrichter Nov 9 at 18:00
Save yourself a lot of heartache and use curl instead.
– tk421 Nov 10 at 6:09
With curl I get 'requests.exceptions.SSLError: HTTPSConnectionPool(host='HOSTNAME', port=8090): Max retries exceeded with url: /cluster/nodes (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1045)')))' as an error message
– Pedro Alves Nov 12 at 14:30
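The exception class in that message (`requests.exceptions.SSLError`) is raised by requests, and "self signed certificate in certificate chain" suggests the CA that signed the cluster's certificate is not in the trust store Python uses. One possible way around it, sketched below, is to point requests at a PEM file containing that CA through the `verify` argument. The function name `get_with_ca` is illustrative, and the CA file has to be obtained from whoever administers the cluster.

```python
def get_with_ca(url, ca_bundle, auth=None):
    """GET `url`, validating the server certificate against `ca_bundle`.

    `ca_bundle` is the path to a PEM file holding the CA certificate(s)
    to trust. `auth` can be any requests auth object, for example an
    HTTPKerberosAuth instance.
    """
    # Imported lazily so the module loads even where requests is missing.
    import requests

    # verify= accepts a path to a CA bundle instead of the default trust store.
    return requests.get(url, auth=auth, verify=ca_bundle, timeout=30)
```

Passing `verify=False` would also silence the error, but it disables certificate checking entirely, so supplying the CA file is usually the safer choice.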
asked Nov 9 at 16:21 by Pedro Alves