
[Solved] Strange Python behavior on Ricky


trbot


Hi!
A few days ago everything was fine and my script was working well.

Here it is:

 

 

#!/usr/bin/python3.6

print("Content-Type: text/html\n\n")
import requests
response = requests.get('https://www.ya.ru')
print(response.url + response.text)

 

The problem is in the requests module: it just doesn't work anymore! All I get is Content-Type: text/html and that's all.

Can you fix it, please?

I run it in the cgi-bin directory with 0775 permissions.


It has nothing to do with our servers. It has to do with https://www.ya.ru/. Most websites don't like to be scraped, so most likely they noticed you scraping their website and implemented some mechanism to block you. Try using requests to scrape some other website and you'll see that it's working just fine. If a website is going to these lengths to block you, perhaps you should just leave them alone, but if you insist on scraping them anyway, you might find that using a public proxy will get around their block.


Before starting this topic I tried everything.

At first it worked well. I used https://www.alphavantage.co with Python, and it simply can't be a ban because it gives stock info to everyone!

OK, now I created a PHP script with this line on your server:

 

 
echo  file_get_contents('http://www.ya.ru/');

 

and it works!

 

In Python:

#!/usr/bin/python3.6

print("Content-Type: text/html\n\n")

import requests

response = requests.get('http://www.ya.ru')
print(response.url + response.text)

 

And it does NOT work!!! I see only Content-Type: text/html

It just creates json.cpython-36.pyc in the __pycache__ directory and nothing else happens.

So how can you explain that it works fine in PHP but not in Python?

There is no ban, 100%. I tried a lot of different sites and got nothing!

Something is wrong with the requests module in Python.
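
When a CGI script prints the header and then nothing else, it has often raised an exception after the header was sent, and the error ends up in the server log instead of the browser. A minimal sketch of one way to make such errors visible on the page follows. The names run_cgi and fetch_page are hypothetical helpers made up for this illustration, and the import of requests is placed inside the handler on the assumption that the import itself may be what fails.

```python
import traceback
from html import escape


def run_cgi(handler):
    """Build a full CGI response; if handler fails, show the traceback instead."""
    try:
        body = handler()
    except Exception:
        # Escape the traceback so it displays as text inside the HTML page.
        body = "<pre>" + escape(traceback.format_exc()) + "</pre>"
    return "Content-Type: text/html\n\n" + body


def fetch_page():
    # Hypothetical handler: imported here so that a failing import is
    # caught by run_cgi and reported like any other error.
    import requests

    response = requests.get("http://www.ya.ru", timeout=10)
    return response.url + response.text


# In the CGI script itself, the last line would be:
# print(run_cgi(fetch_page))
```

This only reports errors raised inside the handler; a failure before run_cgi is defined would still produce an empty page.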


I've tested requests on Python 3.6 on Ricky and I can't find anything wrong with it. Try this code:

#!/usr/bin/python3.6

import requests

print("Content-Type: text/html\n\n")

response = requests.get('https://www.heliohost.org/ip.php')
print(response.url + response.text)

Working example on Ricky: https://krydos1.heliohost.org/cgi-bin/req.py

 

As for the Content-Type header being printed twice, that might mean you have a file name conflict. For instance, if you named a file requests.py and then did import requests, Python would import your own file instead of the library, causing the header to be printed twice.
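
A file name conflict like this can be checked for mechanically. The following is a rough sketch, not an official tool: find_shadowing_files and importable_names are hypothetical helper names, and the approach assumes that comparing file names in the script directory against the module names found on the rest of sys.path is a good enough test.

```python
import pkgutil
import sys
from pathlib import Path


def importable_names(exclude_dir):
    """Collect top-level module names visible on sys.path, skipping exclude_dir."""
    exclude = Path(exclude_dir).resolve()
    search = [p for p in sys.path if p and Path(p).resolve() != exclude]
    names = {info.name for info in pkgutil.iter_modules(search)}
    return names | set(sys.builtin_module_names)


def find_shadowing_files(directory):
    """Return .py files in directory whose names match an importable module."""
    names = importable_names(directory)
    return sorted(
        path.name for path in Path(directory).glob("*.py") if path.stem in names
    )


if __name__ == "__main__":
    # Check the current directory, e.g. when run from inside cgi-bin.
    print(find_shadowing_files("."))
```

The comparison is case-sensitive, so only exact matches with a module name are reported. Any file it lists is a candidate for renaming.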


I finally figured it out!

Python is so weird...

If I name the file re.py (with requests imported inside), it doesn't work; if I rename the file to something else, it works!

And if there is a file called json.py in the same directory, the requests module doesn't work in any .py file there!

So my conclusion: don't give files names like json.py or re.py!!!

Json.py and Re.py are OK.

 

Can you tell me whether it is possible to configure the .htaccess file, or some other config, so that when I open a directory with index.py inside it, I can open the base directory directly, like this:

http://host/cgi-bin/test/ -> http://host/cgi-bin/test/index.py

the way it works with index.php and index.html files?

I'm a newbie in Python, so I'm taking my first steps...


Create a /home/trbot/public_html/.htaccess file with these contents:

Options +ExecCGI
AddHandler cgi-script .py
DirectoryIndex index.py

The first two lines make .py files executable outside cgi-bin, and the last line serves index.py when someone goes to domain.heliohost.org, so they don't have to type out the filename as in domain.heliohost.org/index.py.
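
For illustration, an index.py served by this configuration could look roughly like the sketch below. The shebang path is the one used earlier in this thread; build_response and the page content are placeholders invented for the example. Like the scripts in cgi-bin, the file would need execute permission.

```python
#!/usr/bin/python3.6
# Hypothetical index.py: a placeholder page served as the directory index.


def build_response(title="It works"):
    """Return a complete CGI response: header, blank line, then the HTML body."""
    body = "<html><body><h1>{}</h1></body></html>".format(title)
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    print(build_response())
```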
Thanks!

Is it possible to configure index.php and index.py so that, if index.py is absent, the server does not show the directory listing but tries to open index.php instead, and shows a 404 if that is absent too?
