

Time Machine - QQ Space Crawler

 

This article is currently an experimental machine translation and may contain errors. If anything is unclear, please refer to the original Chinese version. I am continuously working to improve the translation.

Many people set their QQ Space posts to be visible only for the last three days or six months, so I built this crawler to scrape and keep all of their posts…

In the future I might add a feature to recursively crawl the QQ Spaces of a user's friends, and then of their friends (inception-style), which might come in handy for certain social engineering purposes.

The main work was creating a Docker wrapper for an open-source project on GitHub.
https://github.com/wwwpf/QzoneExporter

Features include…

Export and display QQ Space data

- Export blog posts (journals), guestbook entries, albums, status updates (shuoshuo), and more.
- Download images and videos from status updates and albums to local storage.
- Display local data as a web page, with automatic download of images and videos during browsing.
- Support writing Exif metadata back to photos, and embed timestamps into filenames.
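As an illustration of the last item, here is a minimal sketch of embedding a timestamp into a filename. The function name `timestamped_name` and the `YYYYMMDD_HHMMSS_` prefix format are assumptions made for this example; QzoneExporter's actual naming scheme may differ.

```python
from datetime import datetime
from pathlib import PurePosixPath


def timestamped_name(path: str, taken_at: datetime) -> str:
    """Return `path` with the capture time prefixed to the filename.

    The prefix format is a hypothetical choice for this sketch,
    not necessarily what QzoneExporter produces.
    """
    p = PurePosixPath(path)
    stamp = taken_at.strftime("%Y%m%d_%H%M%S")
    # Keep the directory part untouched and only rename the final component.
    return str(p.with_name(f"{stamp}_{p.name}"))


if __name__ == "__main__":
    print(timestamped_name("album/photo.jpg", datetime(2020, 1, 2, 3, 4, 5)))
```

Putting the timestamp first means a plain alphabetical sort of the files is also a chronological sort.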

I’ve also added an auto-login feature that bypasses Tencent’s sliding CAPTCHA.

The CAPTCHA bypass approach was inspired by: https://github.com/ybsdegit/captcha_qq

Additionally, I’m using Git for version control of the data, so you can actually track all historical posts over time.
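A rough sketch of how such snapshots could work: after each crawl, commit the export directory only if something changed. The helper names `run_git` and `snapshot`, the committer identity, and the commit message are all assumptions for this example, not the project's actual code. It needs the `git` command-line tool on the PATH.

```python
import subprocess
from pathlib import Path


def run_git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its stdout."""
    result = subprocess.run(
        ["git", "-C", str(repo), *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def snapshot(repo_dir: str, message: str) -> bool:
    """Commit the current state of the exported data.

    Returns True if a commit was created, False if nothing changed.
    """
    repo = Path(repo_dir)
    if not (repo / ".git").exists():
        run_git(repo, "init")
    # Stage additions, modifications, and deletions.
    run_git(repo, "add", "-A")
    # Empty porcelain output means the working tree matches the last commit.
    if not run_git(repo, "status", "--porcelain").strip():
        return False
    run_git(
        repo,
        # Hypothetical committer identity so the sketch works without global git config.
        "-c", "user.name=qzone-crawler",
        "-c", "user.email=crawler@example.invalid",
        "commit", "-m", message,
    )
    return True
```

Running `snapshot("data", "crawl")` after every crawl gives one commit per change, so `git log -p` shows when a post appeared, changed, or disappeared.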

The code isn’t open-sourced yet; it’s currently hosted on Gitee…

This article is licensed under the CC BY-NC-SA 4.0 license.

Author: lyc8503, Article link: https://blog.lyc8503.net/en/post/qzone-exporter/
If this article was helpful or interesting to you, consider buying me a coffee¬_¬
Feel free to comment in English below o/