idhyt: python challenge 0-10

最近闲下来了四处看资料，然后找到了这个python challenge网页闯关游戏。
这个游戏需要通过一些提示找出下一关的网页地址，链接：http://www.pythonchallenge.com/
题目出的也是充满想象力，斗智斗勇中。。。
解题代码整理在我的 github

python challenge 0

http://www.pythonchallenge.com/pc/def/0.html

# 2的38次幂
def calc():
    return 2 ** 38
>>> 274877906944L

python challenge 1

http://www.pythonchallenge.com/pc/def/274877906944.html

通过图片可知道，所有字符右移两位即可。

def map_():
    first_alpha = chr(ord("m") + 2)
    second_alpha = chr(ord("a") + 2)
    third_alpha = chr(ord("p") + 2)
    return "".join([first_alpha, second_alpha, third_alpha])
>>> ocr

python challenge 2

http://www.pythonchallenge.com/pc/def/ocr.html

查找出现次数最少的字符，先是统计出所有字符出现的次数，然后获取出现次数最早的，要注意按照原来顺序排列。

def ocr():
    stats = {}
    req = requests.get("http://www.pythonchallenge.com/pc/def/ocr.html")
    page_source = req.text
    mess_pattern = re.compile(r"(%%[\s\S]+)-->", re.IGNORECASE)
    mess_content = mess_pattern.findall(page_source)[0]
    # print mess_content

    for character in mess_content:
        if character in stats:
            stats[character] += 1
        else:
            stats[character] = 1
    for key_, value_ in stats.items():
        print "%s -> %d" % (key_, value_)

    # a e i l q u t y
    return "".join(re.findall(r"[a-z]+", mess_content))
>>> u'equality'

python challenge 3

http://www.pythonchallenge.com/pc/def/equality.html

小写字符两边有三个大写字母,格式为aAAAaAAAa
AAAaAAA这种匹配是错误的，要保证两边都是三位大写字母

def equality():
    req = requests.get("http://www.pythonchallenge.com/pc/def/equality.html")
    page_source = req.text
    mess_pattern = re.compile(r"kAewtloYgc[\s\S]+", re.IGNORECASE)
    mess_content = mess_pattern.findall(page_source)[0]
    return "".join(re.findall(r"[a-z]{1}[A-Z]{3}([a-z]{1})[A-Z]{3}[a-z]{1}", mess_content))
>>> u'linkedlist'

python challenge 4

http://www.pythonchallenge.com/pc/def/linkedlist.php

点击图片会有提示,然后更换链接参数继续,大概200多次即可得到答案。
其中正则不能只写"[0-9]+“,中间有陷阱：

There maybe misleading numbers in the text. One example is 82683\. Look only for the next nothing and the next nothing is 63579

完整代码如下：

def linkedlist(times=1, param="123456"):
    try:
        url = "?nothing=".join(["http://www.pythonchallenge.com/pc/def/linkedlist.php", param])
        page_source = requests.get(url).text
        next_param = re.findall("and the next nothing is ([0-9]+)", page_source)[0]
        print "%d -> %s" % (times, next_param)
        times += 1
        linkedlist(times, next_param)
    # 匹配不到跳到异常
    except IndexError:
        print "the right url is : %s" % url
        return page_source
>>> the right url is : http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=66831
    peak.html

python challenge 5

http://www.pythonchallenge.com/pc/def/peak.html

这关估计卡了不少人，从页面源码只能得到banner.p文件地址,即http://www.pythonchallenge.com/pc/def/banner.p
跟我一样没有用过pickle模块的一般都解不出来，看到文件内容也是没有头绪的，通过上网查了资料才知道如何解题。

# pickle模块的使用
def peak():
    import pickle
    page_source = requests.get("http://www.pythonchallenge.com/pc/def/peak.html").text
    banner_name = re.findall(r"", page_source)[0]
    banner_url = "".join(["http://www.pythonchallenge.com/pc/def/", banner_name])
    banner_content = requests.get(banner_url).text
    # print banner_content
    banner_obj = pickle.loads(banner_content)
    for list_ in banner_obj:
        print "".join([tuple_[0] * tuple_[1] for tuple_ in list_])

>>>
              #####                                                                      ##### 
               ####                                                                       #### 
               ####                                                                       #### 
               ####                                                                       #### 
               ####                                                                       #### 
               ####                                                                       #### 
               ####                                                                       #### 
               ####                                                                       #### 
      ###      ####   ###         ###       #####   ###    #####   ###          ###       #### 
   ###   ##    #### #######     ##  ###      #### #######   #### #######     ###  ###     #### 
  ###     ###  #####    ####   ###   ####    #####    ####  #####    ####   ###     ###   #### 
 ###           ####     ####   ###    ###    ####     ####  ####     ####  ###      ####  #### 
 ###           ####     ####          ###    ####     ####  ####     ####  ###       ###  #### 
####           ####     ####     ##   ###    ####     ####  ####     #### ####       ###  #### 
####           ####     ####   ##########    ####     ####  ####     #### ##############  #### 
####           ####     ####  ###    ####    ####     ####  ####     #### ####            #### 
####           ####     #### ####     ###    ####     ####  ####     #### ####            #### 
 ###           ####     #### ####     ###    ####     ####  ####     ####  ###            #### 
  ###      ##  ####     ####  ###    ####    ####     ####  ####     ####   ###      ##   #### 
   ###    ##   ####     ####   ###########   ####     ####  ####     ####    ###    ##    #### 
      ###     ######    #####    ##    #### ######    ###########    #####      ###      ######

http://www.pythonchallenge.com/pc/def/channel.html

一张拉链图片，看了源码也没有任何提示，只知道是要捐赠。。。
再仔细看第一行，，将html变zip
下载channel.zip文件，链接：http://www.pythonchallenge.com/pc/def/channel.zip
下载完成后解压，和第四题一样，只不过多了文件遍历操作，遍历从任意文件开始都可以。
我这里从os.walk遍历到的文件列表中第一个文件开始遍历。

def channel():
    import os

    root_path = "D:\\downloads\\channel"

    def linked_file(times=1, file_name="123456"):
        try:
            file_path = "\\".join([root_path, file_name])
            file_content = open(file_path, "r+").read()
            next_file_name = re.findall("Next nothing is ([0-9]+)", file_content)[0]
            print "%d -> %s.txt" % (times, next_file_name)
            times += 1
            linked_file(times, "".join([next_file_name, ".txt"]))
        # 匹配不到跳到异常
        except IndexError:
            print "the right link in file : %s" % file_name
            print file_content

    for root, dirs, files in os.walk(root_path):
        for file_ in files:
            linked_file(1, file_)
            break
>>> the right link in file : 46145.txt
    Collect the comments.

坑爹的答案，让我收集注释，那么答案在哪里。。。
再仔细翻解压的文件，发现里边有个readme.txt !!!

welcome to my zipped list.
hint1: start from 90052
hint2: answer is inside the zip

将代码中linkedfile(1, file)改为linked_file(1, "90052.txt”)
最后输出答案一样！！！

难道是我的解压方式不对？！？！
调用zipfile模块,使用ZipInfo类获取comment

def channel():
    import zipfile
    zf = zipfile.ZipFile("D:\\downloads\\channel.zip")
    result = []
    begin_file_name = "90052.txt"
    while True:
        z_info = zf.getinfo(begin_file_name)
        f = zf.open(z_info)
        f_content = f.read()
        next_file_name = re.findall(r"Next nothing is ([0-9]+)", f_content)
        f.close()
        result.append(z_info.comment)
        if next_file_name:
            begin_file_name = "%s.txt" % next_file_name[0]
        else:
            break

    print "".join(result)
>>>
****************************************************************
****************************************************************
**                                                            **
**   OO    OO    XX      YYYY    GG    GG  EEEEEE NN      NN  **
**   OO    OO  XXXXXX   YYYYYY   GG   GG   EEEEEE  NN    NN   **
**   OO    OO XXX  XXX YYY   YY  GG GG     EE       NN  NN    **
**   OOOOOOOO XX    XX YY        GGG       EEEEE     NNNN     **
**   OOOOOOOO XX    XX YY        GGG       EEEEE      NN      **
**   OO    OO XXX  XXX YYY   YY  GG GG     EE         NN      **
**   OO    OO  XXXXXX   YYYYYY   GG   GG   EEEEEE     NN      **
**   OO    OO    XX      YYYY    GG    GG  EEEEEE     NN      **
**                                                            **
****************************************************************
 **************************************************************

将url改为hockey.html，竟然有小关卡！！！
it's in the air. look at the letters
赶紧开开脑洞，空气中是什么，氧气啊！
oxygen :)

python challenge 7

http://www.pythonchallenge.com/pc/def/oxygen.html

这个题需要解析图片，之前没有接触到因此是查资料完成。
1.需要安装PIL库
2.截取黑白部分
3.将RGBA格式的像素每个像素值按照L8位黑白像素的格式转成一个acsii码值
4.将最后获得的答案转换为ASCII码

def oxygen():
    from PIL import Image
    img = Image.open("oxygen.png")
    # left,top,right,bottom
    box = (0, 43, 608, 52)
    belt = img.crop(box)
    # get a sequence object containing pixel values
    pixels = belt.getdata()
    print('mode: %s' % img.mode)
    print('amount of pixel: %d' % len(pixels))
    # print(pixels[0])

    # convert mode RGBA to mode L
    l_belt = belt.convert('L')
    # get a sequence object containing pixel values
    l_pixels = l_belt.getdata()

    result = []
    for i in range(0, 608, 7):
        result.append(chr(l_pixels[i]))

    print ''.join(result)
    
>>> smart guy, you made it. the next level is [105, 110, 116, 101, 103, 114, 105, 116, 121]

最后将[105, 110, 116, 101, 103, 114, 105, 116, 121]转为ascii即为答案:integrity

python challenge 8

http://www.pythonchallenge.com/pc/def/integrity.html

点击中间的蜜蜂弹出要求输入用户名和密码
然后看下源码，找到了用户名密码

un: 'BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084'
pw: 'BZh91AY&SY\x94$|\x0e\x00\x00\x00\x81\x00\x03$ \x00!\x9ah3M\x13<]\xc9\x14\xe1BBP\x91\xf08'

看下加密字符串特征BZ开头，猜测是否为bz2压缩，尝试成功。

def integrity():
    import bz2
    un = 'BZh91AY&SYA\xaf\x82\r\x00\x00\x01\x01\x80\x02\xc0\x02\x00 \x00!\x9ah3M\x07<]\xc9\x14\xe1BA\x06\xbe\x084'
    pw = 'BZh91AY&SY\x94$|\x0e\x00\x00\x00\x81\x00\x03$ \x00!\x9ah3M\x13<]\xc9\x14\xe1BBP\x91\xf08'
    print ": ".join(["username", bz2.decompress(un)])
    print ": ".join(["password", bz2.decompress(pw)])
>>>
    username: huge
    password: file

python challenge 9

http://www.pythonchallenge.com/pc/return/good.html

查看下源码找到最下边注释部分，和8题页面源码部分是不是很相似，应该是绘制图片的RGB值。
替换过去试试看，鼠标移动，发现浮现了一头牛的图案。
下边通过代码将这个图片完整的绘制出来。

def good():
    from PIL import Image, ImageDraw
    im = Image.new('RGB', (500, 500))
    draw = ImageDraw.Draw(im)

    headers = {
        "Authorization": "Basic aHVnZTpmaWxl"
    }
    page_source = requests.get("http://www.pythonchallenge.com/pc/return/good.html", headers=headers).text
    result = re.findall(r"first:\s([\s\S]+)second:\s([\s\S]+)-->", page_source)
    if len(result) > 0 and len(result[0]) == 2:
        first, second = result[0][0], result[0][1]
        first_points = list(eval(first.replace("\n", "")))
        second_points = list(eval(second.replace("\n", "")))
        for i in range(0, len(first_points), 2):
            draw.line(first_points[i:i + 4], fill='white')
        for i in range(0, len(second_points), 2):
            draw.line(second_points[i:i + 4], fill='white')
        im.save('img/09.jpg')

最后打印出来：
9-bull
看到牛先想到单词cow，然后进去试试看，发现有新的提示：
hmm. it's a male.
那就是公牛喽，换成bull,成功。

python challenge 10

http://www.pythonchallenge.com/pc/return/bull.html

这个题的图片赫然就是我们绘制的那头牛哇，感觉萌萌哒。
看下边len(a[30])=? 要知道这个答案，肯定要得到a这个序列。
查看下页面源码，点开里边的sequence.txt内容为(点击牛也会跳出)：

a = [1, 11, 21, 1211, 111221,

显然，要得到的答案就是这里了，下边找规律吧。

index	look	say
0	.	1
1	第0项有 1个1	11
2	第1项有 2个1	21
3	第2项有 1个2 1个1	1211
4	第3项有 1个1 1个2 2个1	111221
5	.	.

规律找到，编写代码如下：

def bull():

    def get_next_item(str_item=""):
        item = []
        if len(str_item) == 0:
            item.append("1")
        elif len(str_item) > 0:
            cur_list = list(str_item)
            same_count = 1
            if len(cur_list) == 1:
                item.append("".join([str(same_count), cur_list[0]]))
            elif len(cur_list) > 1:
                for i in range(len(cur_list)):
                    if i + 1 >= len(cur_list):
                        item.append("".join([str(same_count), cur_list[i]]))
                    elif i + 1 < len(cur_list):
                        if cur_list[i] == cur_list[i+1]:
                            same_count += 1
                        else:
                            item.append("".join([str(same_count), cur_list[i]]))
                            same_count = 1
        return "".join(item)

    index = 0
    cur_item = ""
    while index < 31:
        cur_item = get_next_item(cur_item)
        index += 1
    print "len(a[%d]) = %d" % (index-1, len(cur_item))
    
>>> len(a[30]) = 5808

网上看下别人的答案，找到一种更简单的实现方式，调用groupby函数，一行代码即可实现。

def bull_ex():
    from itertools import groupby
    a = '1'
    for i in range(30):
        a = ''.join(str(len(list(v))) + k for k, v in groupby(a))
    print len(a)

idhyt

2015-08-07

python challenge 0-10