
Introduction
Background
Latest version of the program: SRCHunter, an open-source scanner based on Python
Some users reported bugs in the information-gathering script, so this post walks through debugging it.
Core design ideas
Cover every case as far as possible | catch and handle every exception | initialize every variable | verify with random, unordered values
The program implements two main features:
- Fast probing, no directory scanning
- Sensitive asset scanning
Based on Python 2.7.
To handle certificate problems on some sites: pip install requests[security]
Download the portscan.py plugin: http://www.cnnetarmy.com/soft/portscan.py
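Since the scanner calls requests with verify=False throughout (see getitle and dirscan below), urllib3 prints InsecureRequestWarning noise. An optional snippet to silence it (my own addition, not part of webmain.py):

# Optional: silence the InsecureRequestWarning caused by verify=False requests.
# Extra convenience snippet, not part of the original webmain.py.
import requests

try:
    requests.packages.urllib3.disable_warnings()
except AttributeError:
    # Fall back to the standalone urllib3 package if requests does not bundle it
    import urllib3
    urllib3.disable_warnings()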
Usage
Fast probing (webscan, no scanDir): python webmain.py -f vuln_domains.txt
Sensitive asset scan (webscan + portscan && scanDir): python webmain.py -d vuln_domains.txt
Results are saved to vuln_domains_datetime.html
Target site information gathering (-F)
Goal: collect target site information without triggering the WAF.
The report function
Generates the scan report, named %flag%_date.html.
def report():
    '''
    Report result to target_time.html
    '''
    output_file = sys.argv[2].split('.')[0] + time.strftime('%Y-%m-%d',time.localtime(time.time())) + '.html'
    return output_file
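As a quick illustration (the date below is hypothetical), running the fast-probe command makes sys.argv[2] equal to 'vuln_domains.txt', so report() builds the file name like this:

# Hypothetical run: python webmain.py -f vuln_domains.txt
#   sys.argv[2]                -> 'vuln_domains.txt'
#   sys.argv[2].split('.')[0]  -> 'vuln_domains'
#   report()                   -> 'vuln_domains2018-01-28.html'  (date is illustrative)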
The requests_headers function
Uses the random module so every request goes out with a random User-Agent.
Replace 'Change Me !!!' with a post-login cookie to scan with that cookie, which is useful for single sign-on targets.
def requests_headers():
    '''
    Random UA for every requests && Use cookie to scan
    '''
    cookie = 'Change Me !!!'
    user_agent = ['Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.0',
                  # ... (more User-Agent strings elided) ...
                  'Mozilla/5.0 (Windows; U; Windows NT 6.0; fr-FR) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5']
    UA = random.choice(user_agent)
    headers = {
        'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'User-Agent':UA,'Upgrade-Insecure-Requests':'1','Connection':'keep-alive','Cache-Control':'max-age=0',
        'Accept-Encoding':'gzip, deflate, sdch','Accept-Language':'zh-CN,zh;q=0.8','Cookie':cookie}
    return headers
The requests_proxies function
Request proxy configuration, disabled by default. If you use the helper function that looks up same-IP sites, configure a proxy that can reach the API (e.g. shadowsocks) and simply enable 127.0.0.1:1080; see the comments in the code.
The other address in the comments, 127.0.0.1:8080, routes the program through a local Burp Suite instance, which makes it easier to analyze certain requests and responses.
def requests_proxies():
    '''
    Proxies for every requests
    '''
    proxies = {
        'http':'',  # 127.0.0.1:1080 shadowsocks
        'https':''  # 127.0.0.1:8080 BurpSuite
    }
    return proxies
The url2ip function
Resolves a URL to an IP address; this function follows the program's core design ideas.
def url2ip(url):
    '''
    Url to ip
    '''
    ip = ''
    try:
        handel_url = urlparse.urlparse(url).hostname
        ip = socket.gethostbyname(handel_url)
    except:
        print '[!] Can not get ip'
        pass
    return ip
The portscan function
Loads portscan.py and scans all ports.
def portscan(ip):
    '''
    Scan open port | all ports
    '''
    open_ports = []
    try:
        m = __import__('portscan')
        p = m.Work(scan_target = ip)
        open_ports = p.run()
    except:
        print '[*] Need load portscan.py plugin'
        print '[*] Download from: http://www.cnnetarmy.com/soft/portscan.py'
        pass
    if len(open_ports) > 100:
        print '[!] Maybe got waf'
        return open_ports
    return open_ports
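If you cannot download the plugin, the interface webmain.py expects is small: a Work class constructed with scan_target whose run() returns a list of open ports. A minimal TCP connect-scan stand-in under those assumptions (the port list and timeout below are my own choices, not the original plugin's):

# portscan.py -- minimal stand-in sketch, NOT the original plugin.
# Only the interface used by webmain.py (Work(scan_target=ip).run()) is taken from the code above;
# the simple connect scan over a short, assumed port list is mine.
import socket

class Work(object):
    def __init__(self, scan_target):
        self.scan_target = scan_target
        # Assumed port list; the real plugin scans all ports.
        self.ports = [21, 22, 80, 443, 3306, 6379, 8080, 8443]

    def run(self):
        open_ports = []
        for port in self.ports:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.settimeout(0.5)
            try:
                if s.connect_ex((self.scan_target, port)) == 0:
                    open_ports.append(port)
            except socket.error:
                pass
            finally:
                s.close()
        return open_ports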
The getitle function
Fetches the site title and returns the title, status code, and page size.
def getitle(url):
    '''
    Get title,status_code,content_lenth
    '''
    headers = requests_headers()
    proxies = requests_proxies()
    if '://' not in url:
        url = 'http://' + url
    if ':443' in url:
        url = 'https://' + url.replace(':443','').replace('http://','')
    title = ''
    code = ''
    lenth = ''
    try:
        req = requests.get(url = url, headers = headers, proxies = proxies,verify = False,timeout = 3)
        code = req.status_code
        lenth = len(req.content)
        if code in range(200,405) and len(req.content) != 0:
            title = re.findall(r'<title>(.*?)</title>',req.content)[0]
    except:
        pass # ignore Exception
    return title,code,lenth
The checkFast function
The main fast-probing function: it converts the URL to an IP, scans all ports, and runs getitle against every open port. A small detail is the filter_ports list, which filters out common ports.
To handle several URLs resolving to the same IP, the program keeps a filter_ips list so an IP whose ports have already been scanned is not scanned again.
It also covers the case where the URL is alive but no open ports were found: the program then automatically probes port 80. Example output: (screenshot)
Debug log: (screenshot)
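checkFast itself is not listed in this post; based on the description above, a rough sketch (the signature, the filter_ports values, and the result layout are my assumptions, not the original code) looks like this:

# Rough sketch of the fast-probing flow described above -- not the original checkFast.
# filter_ips avoids rescanning an IP shared by several URLs; filter_ports drops
# common ports; if nothing is open but the host is alive, fall back to port 80.
filter_ips = []
filter_ports = [21, 22, 23, 25, 53, 110, 135, 139, 445, 3306, 3389]  # assumed values

def checkFast(url):
    ip = url2ip(url)
    if not ip or ip in filter_ips:
        return []
    filter_ips.append(ip)
    results = []
    open_ports = [p for p in portscan(ip) if p not in filter_ports]
    if not open_ports:
        # URL is alive but no open ports were found: probe port 80 directly
        open_ports = [80]
    for port in open_ports:
        title, code, lenth = getitle('{}:{}'.format(ip, port))
        if code:
            results.append((ip, port, title, code, lenth))
    return results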
Target site sensitive asset scanning (-D)
Goal: add port scanning plus sensitive file and directory scanning.
The base64str function
Returns a random string used to fingerprint the 404 page; this function implements the "random, unordered verification" core idea.
def base64str():
    '''
    Return Random base64 string
    '''
    key = random.random() * 10 # Handle "0." -> "/MC4" Character
    return base64.b64encode(str(key)).replace('=','')
The getMonth function
Generates date-based backup-file names, only in the most common format 20180101[.zip|...], for the sensitive backup-file scan. The code below covers the last 3 days; how many days to cover can be changed as needed (see the variation after the code).
def getMonth():
    '''
    Return 3x days ago backup file. eg:20171228.rar
    '''
    month = []
    monPayload = ["/%test%.7z","/%test%.rar","/%test%.zip","/%test%.tar.gz"]
    for mon in monPayload:
        for i in range(3):
            i = (datetime.datetime.now() - datetime.timedelta(days = i))
            flag = i.strftime('%Y%m%d')
            flag = mon.replace('%test%',flag)
            month.append(flag)
    return month
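To cover a different window, e.g. roughly a full month, one option (my own variation, not part of the original script) is to make the day count a parameter:

# Variation on getMonth(), not in the original webmain.py:
# the number of days is a parameter, so getMonth(30) covers roughly one month.
import datetime

def getMonth(days = 3):
    month = []
    monPayload = ["/%test%.7z","/%test%.rar","/%test%.zip","/%test%.tar.gz"]
    for mon in monPayload:
        for i in range(days):
            day = datetime.datetime.now() - datetime.timedelta(days = i)
            month.append(mon.replace('%test%', day.strftime('%Y%m%d')))
    return month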
The dirscan function
The main sensitive-directory scanning function. It ships with a collection of common probe payloads plus the generated date-based backup names. The program first fingerprints the 404 page, then iterates over the payloads.
A payload counts as a hit when it returns 200, the response body is not empty, and the absolute difference between the payload response size and the 404 page response size is greater than 5 (or, more simply, the two sizes are not equal).
As an additional check, if a WAF or some other unexpected situation causes more than 40 payloads to "hit", the program warns that it probably ran into a WAF and returns an empty result.
def dirscan(url):
    '''
    Webdir weakfile scan
    '''
    dirs = []
    headers = requests_headers()
    proxies = requests_proxies()
    hostbak = urlparse.urlparse(url).hostname
    month = getMonth()
    random_str = base64str()
    payloads = ["/robots.txt","/README.md","/crossdomain.xml","/.git/config","/.svn/entries","/.svn/wc.db","/.DS_Store","/CVS/Root","/CVS/Entries","/.idea/workspace.xml"]
    payloads += ["/index.htm","/index.html","/index.php","/index.asp","/index.aspx","/index.jsp","/index.do","/index.action"]
    payloads += ["/www/","/console","/web-console","/web_console","/jmx-console","/jmx_console","/JMXInvokerServlet","/invoker","/phpinfo.php","/info.php"]
    payloads += ["/index.bak","/index.swp","/index.old","/.viminfo","/.bash_history","/.bashrc","/project.properties","/config.properties","/config.inc","/common.inc","/db_mysql.inc","/install.inc","/conf.inc","/db.inc","/setup.inc","/init.inc","/config.ini","/php.ini","/info.ini","/setup.ini","/www.ini","/http.ini","/conf.ini","/core.config.ini","/ftp.ini","/data.mdb","/db.mdb","/test.mdb","/database.mdb","/Database.mdf","/BookStore.mdf","/DB.mdf","/1.sql","/install.sql","/schema.sql","/mysql.sql","/dump.sql","/users.sql","/update.sql","/test.sql","/user.sql","/database.sql","/sql.sql","/setup.sql","/init.sql","/login.sql","/backup.sql","/all.sql","/passwd.sql","/init_db.sql","/fckstyles.xml","/Config.xml","/conf.xml","/build.xml","/web.xml","/test.xml","/ini.xml","/www.xml","/db.xml","/database.xml","/admin.xml","/login.xml","/sql.xml","/sample.xml","/settings.xml","/setting.xml","/info.xml","/install.xml","/Php.xml","/.mysql_history"]
    payloads += ["/nginx.conf","/httpd.conf","/test.conf","/conf.conf","/local.conf","/user.txt","/LICENSE.txt","/sitemap.xml","/username.txt","/pass.txt","/passwd.txt","/password.txt","/.htaccess","/web.config","/app.config","/log.txt","/config.xml","/CHANGELOG.txt","/INSTALL.txt","/error.log"]
    payloads += ["/login","/phpmyadmin","/pma","/pmd","/SiteServer","/admin","/Admin/","/manage","/manager","/manage/html","/resin-admin","/resin-doc","/axis2-admin","/admin-console","/system","/wp-admin","/uc_server","/debug","/Conf","/webmail","/service","/ewebeditor"]
    payloads += ["/xmlrpc.php","/search.php","/install.php","/admin.php","/login.php","/l.php","/forum.php"]
    payloads += ["/portal","/blog","/bbs","/webapp","/webapps","/plugins","/cgi-bin","/htdocs","/wsdl","/html","/install","/test","/tmp","/file","/solr/#/","/WEB-INF","/zabbix","/backup","/log"]
    payloads += ["/www.7z","/www.rar","/www.zip","/www.tar.gz","/wwwroot.zip","/wwwroot.rar","/wwwroot.7z","/wwwroot.tar.gz","/%flag%.7z","/%flag%.rar","/%flag%.zip","/%flag%.tar.gz","/backup.7z","/backup.rar","/backup.tar","/backup.tar.gz","/backup.zip","/index.7z","/index.rar","/index.sql","/index.tar","/index.tar.gz","/index.zip"]
    payloads += month
    if url[-1:] == '/':
        url = url[:-1]
    try:
        check_url = url + '/' + str(random_str) # '/Wo4N1Dx1aoKeI'
        print '[*] Now is check waf: ' + check_url
        check_waf = requests.get(url=check_url,proxies=proxies,verify=False,headers=headers,timeout=5)
        for payload in payloads:
            try:
                payload = payload.replace('%flag%',hostbak)
                req = requests.get(url=url+payload,proxies=proxies,verify=False,headers=headers,timeout=5)
                #req = requests.head(url + payload)
                if req.status_code == 200 and abs(len(req.content) - len(check_waf.content)) > 5 and len(req.content) != 0:
                    print '[+] Get %s%s 200 %s' % (url,payload,len(req.content))
                    dirs.append(payload)
            except Exception,e:
                print e
                pass
    except Exception,e:
        print e
        pass
    if len(dirs) > 40:
        print '[*] Maybe Got waf.'
        return '[]'
    else:
        return dirs
The checkDir function
Fuzzes each of the discovered open ports, much like the fast-probing function, but with the sensitive file and directory scanning module added. Example output: (screenshot)
Debug log: (screenshot)
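checkDir is also not listed here; following the same pattern as the checkFast sketch above, with dirscan added per open port (again an approximation under my own assumptions, not the original code):

# Rough sketch of the -d flow: like checkFast, plus dirscan on every port.
# Not the original checkDir; the signature and result layout are assumptions.
def checkDir(url):
    ip = url2ip(url)
    if not ip:
        return []
    results = []
    open_ports = portscan(ip) or [80]
    for port in open_ports:
        target = 'http://{}:{}'.format(ip, port)
        title, code, lenth = getitle(target)
        dirs = dirscan(target) if code else []
        results.append((target, title, code, lenth, dirs))
    return results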
Helper functions
Modules pulled out of the scanner that you may find useful elsewhere.
The getallink function
Crawls the hyperlinks in a web page.
def getallink(url):
    '''
    Get response all link
    '''
    headers = requests_headers()
    proxies = requests_proxies()
    links = []
    tags = ['a','A','link','script','area','iframe','form'] # img
    tos = ['href','src','action']
    skip = ['.png','javascript','.svg','.jpg','.js','.css','/css?','.gif','.jpeg','.ico','.swf','.mpg'] # static resources to filter out
    try:
        req = requests.get(url=url,proxies=proxies,verify=False,headers=headers,timeout=3)
        #if req.status_code in range(300,310):
        #    return req.url
        if 'location.href="' in req.content:
            url = url + re.findall(r'location.href="(.*?)";',req.content)[0].replace(url,'').replace(urlparse.urlparse(url).hostname,'')
        for tag in tags:
            for to in tos:
                link = re.findall(r'<%s.*?%s="(.*?)"'%(tag,to),str(req.content))
                for i in link: # filter
                    if i not in links and not any(s in i for s in skip):
                        links.append(i)
    except Exception,e:
        print e
        pass
    return links
The email matcher (email_regex)
Matches email addresses that may appear in the returned page.
def email_regex(raw):
    '''
    Collect email
    test#cnnetarmy.com | Admin01@cnnetarmy.com.cn | san.Zhang@cnnetarmy.com | si01.Li@cnnetarmy.com | zhaowu01@cnnetarmy.com
    '''
    emails = []
    regex = '[-_\w\.]{0,64}\@[-_\w\.]{0,64}\.{1,2}[-_\w\.]{0,64}'
    regex_one = '([\w-]+@[\w-]+\.[\w-]+)+'
    regex_two = '[-_\w\.]{0,64}[@#][-_\w\.]{0,64}\.{1,2}[-_\w\.]{0,64}'
    regex_three = "[\w!#$%&'*+/=?^_`{|}~-]+(?:\.[\w!#$%&'*+/=?^_`{|}~-]+)*[@#](?:[\w](?:[\w-]*[\w])?\.)+[\w](?:[\w-]*[\w])"
    regex_four = '\w[-\w.+]*[@#]([A-Za-z0-9][-A-Za-z0-9]+\.)+[A-Za-z]{2,14}'
    mailto = r'mailto:(.*?)"'
    try:
        emails = re.findall(regex_three,str(raw))
    except Exception,e:
        print e
        pass
    return emails
The URL matcher (url_regex)
Matches URLs that appear in the returned page.
def url_regex(raw):
    '''
    Collect url
    '''
    urls = []
    regex = '[a-zA-z]+://[^\s]*'
    regex_one = r"\b(http://(\d{1,3}\.){3}\d{1,3}(:\d+)?)\b"
    regex_two = r"((?:https?|ftp|file):\/\/[\-A-Za-z0-9+&@#/%?=~_|!:,.;\*]+[\-A-Za-z0-9+&@#/%=~_|])"
    try:
        urls = re.findall(regex_two,str(raw))
    except Exception,e:
        print e
        pass
    return urls
The IP matcher (ip_regex)
Matches IP addresses that may appear in the returned page.
def ip_regex(raw):
    '''
    Collect ip
    1.1.1.1 | 10.1.1.1 | 256.10.1.256 | 222.212.22.11
    '''
    ips = []
    # Non-capturing groups so re.findall returns whole addresses instead of tuples of groups
    regex = '(?:(?:2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(?:2[0-4]\d|25[0-5]|[01]?\d\d?)'
    regex_one = '[1-9]{1,3}\.[1-9]{1,3}\.[1-9]{1,3}\.[1-9]{1,3}'
    try:
        ips = re.findall(regex,str(raw))
    except Exception,e:
        print e
        pass
    return ips
The C-segment generator (c_duan)
Generates the pool of IP addresses in the same C segment (/24).
def c_duan(ip):
    '''
    Collect ip C.x
    '''
    ip_list = []
    try:
        ip_split = ip.split('.')
        for c in xrange(1,255):
            ip = "%s.%s.%s.%d" % (ip_split[0],ip_split[1],ip_split[2],c)
            ip_list.append(ip)
            open_ports = portscan(ip)
            print ip,open_ports
    except Exception,e:
        print e
        pass
    return ip_list
The SameIpDomain (same-IP sites) function
Looks up other sites hosted on the same IP; the proxies must be enabled (together with shadowsocks) for this to work.
def SameIpDomain(ip):
    '''
    Same Ip Domains
    https://www.bing.com/search?q=IP:43.242.128.230&ensearch=1
    Yujian 2014
    SameIpDomain = ["69116912.com","allensnote.com","baidudaili.net","cbbteam.com","cnnetarmy.com","howeal.com","manbajs.com","njchao.com","nxrtts.com","ourjob.it","shuadanla.com","sijiyoumei.net","wiliu.com","woobian.com","www.cnnetarmy.com","www.shuadanla.com","www.xitongbashi.com","xitongbashi.com","yanghe56.com","yuxith.com"]
    https://www.tcpiputils.com/reverse-ip/43.242.128.230
    '''
    SameIpDomain = []
    headers = requests_headers()
    proxies = requests_proxies()
    if str(proxies) == "{'http': '', 'https': ''}":
        print 'Host api.hackertarget.com need use proxies'
        return SameIpDomain
    else:
        try:
            api = 'http://api.hackertarget.com/reverseiplookup/?q={}'.format(ip) # 43.242.128.230
            req = requests.get(url=api,headers=headers,proxies=proxies,timeout=5,verify = False)
            keys = req.content.split('\n')
            for key in keys:
                if key not in SameIpDomain:
                    SameIpDomain.append(key)
                    print '[+] Get SameIpDomainList: ' + key
        except Exception,e:
            print e
            pass
    return SameIpDomain
The whois function
Depends on the whois module and looks up domain registration information.
def whois(url):
    '''
    Get whois
    '''
    api = 'http://whois.chinaz.com/www.cnnetarmy.com'
    api_one = 'https://x.threatbook.cn/domain/www.cnnetarmy.com'
    whois_result = []
    try:
        import whois
        domain_whois = whois.whois(url) # "http://www.cnnetarmy.com"
        whois_result = json.loads(str(domain_whois))
    except:
        print 'pip install whois'
        pass
    return whois_result
The ipaddr function
Gets the real-world location of an IP, using the Taobao IP API.
def ipaddr(ip):
    '''
    Get ip addr info
    '''
    headers = requests_headers()
    proxies = requests_proxies()
    ip_data = {}
    try:
        api_url = 'http://ip.taobao.com/service/getIpInfo.php?ip={}'.format(ip)
        #api_one = 'https://api.shodan.io/shodan/host/218.196.240.8'
        req = requests.get(url = api_url, headers = headers, proxies = proxies,verify = False,timeout = 5)
        local_ip = json.loads(req.content)
        ip_data = local_ip['data']
    except Exception,e:
        print e
        pass
    return ip_data
The baidu_site function
Runs a Baidu search for site:cnnetarmy.com to collect the list of subdomains.
def baidu_site(url):
    '''
    Baidu site
    '''
    if '://' in url:
        url = urlparse.urlparse(url).hostname
    baidu_url = 'https://www.baidu.com/s?ie=UTF-8&wd=site:{}'.format(url)
    headers = requests_headers()
    proxies = requests_proxies()
    try:
        r = requests.get(url = baidu_url, headers = headers, proxies = proxies,verify = False,timeout = 5).content
        if 'class=\"nors\"' not in r:
            #return '<a href="%s" target=_blank />Baidu_site</a>' % baidu_url
            domains = []
            for i in xrange(0,100): # max page_number
                pn = i * 10
                newurl = 'https://www.baidu.com/s?ie=UTF-8&wd=site:{}&pn={}&oq=site:{}'.format(url,pn,url)
                keys = requests.get(url = newurl, headers = headers, proxies = proxies,verify = False,timeout = 5).content
                flag = re.findall(r'style=\"text-decoration:none;\">(.*?)<\/a><div class=\"c-tools\"',keys)
                for j in flag:
                    domain = j.split('.')[0]
                    domain_handle = domain.replace('https://','').replace('http://','')
                    if domain_handle not in domains:
                        print domain_handle
                        domains.append(domain_handle)
            return domains
        else:
            return ''
    except Exception,e:
        print e
        pass
    return ''
The whatcms function
Uses the whatweb.bugscaner.com API.
def whatcms(url):
    '''
    Cms type identify
    Handle BugScan whatcms.py to requests only leave common cms type data
    To do load xxxcms payload
    api_one = 'http://whatweb.bugscaner.com/look/'
    '''
    headers = requests_headers()
    proxies = requests_proxies()
    try:
        s = requests.Session()
        r = s.get(url='http://whatweb.bugscaner.com/look/',headers = headers, proxies = proxies,verify = False,timeout = 5).content
        hash_r = re.findall(r'<input type="hidden" value="(.*?)" name="hash" id="hash">',str(r))[0]
        url_handle = url.replace(':8080','').replace(':80','')
        if '://' in url:
            url_handle = url.split('://')[1].replace("/",'')
        data = "url={}&hash={}".format(url_handle,hash_r)
        # submit the url/hash form data to the identify endpoint
        key = s.post(url='http://whatweb.bugscaner.com/what/',data = data,headers = headers, proxies = proxies,verify = False,timeout = 5).content
        result = json.loads(key)
        if len(result["cms"]) > 0:
            return result["cms"]
        else:
            return 'www'
    except:
        return 'www'
Code download
No-error mode: http://www.cnnetarmy.com/soft/webmain.py
Debug mode: http://www.cnnetarmy.com/soft/webmain_debug.py
portscan: http://www.cnnetarmy.com/soft/portscan.py
Open a link and press Ctrl+S to save the file.
Debug
Feedback QQ: 2069698797