A system therapist, dataset wizard, Email troubleshooter, website ninja.
And just maybe, a programmer.

"An apple device a day keeps doctoral degree away."
News
Last update: Feb 6, 2025
New Career! I have started my career in Meta as a Research Scientist, I work in internationalization (i18n) team, helping to deliver Meta products globally with a local feeling.
New Career! I have started my career in Bloomberg as a Software Engineer. My work will primarily support internal ticketing pipeline to ensure timely and accurate ticket generation and delivery.
Paper accepted in NDSS 2024! [Like, Comment, Get Scammed: Characterizing Comment Scams on Media Platforms](https://like-comment-get-scammed.github.io/) @inproceedings{li2024commentscams, title = {Like, Comment, Get Scammed: Characterizing Comment Scams on Media Platforms}, author = {Xigao Li and Amir Rahmati and Nick Nikiforakis}, booktitle = {Proceedings of the Network and Distributed System Security Symposium (NDSS)}, year = {2024}, }
I have successfully defended my Ph.D thesis, “Measuring the Role of Automation in Malicious Web Activities” on August 4th, 2023.
Paper accepted in WWW 2023! [Scan Me If You Can: Understanding and Detecting Unwanted Vulnerability Scanning](https://scan-me-if-you-can.github.io) @inproceedings{li2023scanme, author = {Li, Xigao and Amin Azad, Babak and Rahmati, Amir and Nikiforakis, Nick}, title = {Scan Me If You Can: Understanding and Detecting Unwanted Vulnerability Scanning}, year = {2023}, booktitle = {Proceedings of the ACM Web Conference (WWW)}, }
Paper accepted in NDSS 2023! [Double and Nothing: Understanding and Detecting Cryptocurrency Giveaway Scams](https://double-and-nothing.github.io/) @inproceedings{li2023cryptoscams, title = {Double and Nothing: Understanding and Detecting Cryptocurrency Giveaway Scams}, author = {Xigao Li and Anurag Yepuri and Nick Nikiforakis}, booktitle = {Proceedings of the Network and Distributed System Security Symposium (NDSS)}, year = {2023}, }
Paper Accepted at Oakland 2021! [Good Bot, Bad Bot: Characterizing Automated Browsing Activity](https://you.stonybrook.edu/xigaoli/files/2021/04/goodbotbadbot_oakland2021.pdf) @inproceedings{li2021botcharacterization, title = {Good Bot, Bad Bot: Characterizing Automated Browsing Activity}, author = {Xigao Li and Babak {Amin Azad} and Amir Rahmati and Nick Nikiforakis}, booktitle = {Proceedings of the 42nd IEEE Symposium on Security and Privacy (IEEE S\&P)}, year = {2021}, }
About Me
I am a Ph.D graduated in the Department of Computer Science at Stony Brook University.
During my Ph.D, I am co-advised by Professor Nick Nikiforakis and Professor Amir Rahmati. My research focuses on web security and machine learning. On one side, I develop systems to measure and classify automated Internet bots through both machine-learning and heuristic approaches, capture malicious bot behaviors by developing “fingerprinting” techniques. On the other side, I aim to build lightweight and pragmatic deep learning models and use information retrieval techniques to get security insights.
Prior to Stony Brook, I worked on file system security and optimization. My work of disaster-tolerance of MooseFS can be found in here(github), as well as some published paper.
Published Paper
- [WWW 2025] [The Poorest Man in Babylon: A Longitudinal Study of Cryptocurrency Investment Scams] @inproceedings{muzammil2025crimson, title = {The Poorest Man in Babylon: A Longitudinal Study of Cryptocurrency Investment Scams}, author = {Muhammad Muzammil and Abisheka Pitumpe and Xigao Li and Amir Rahmati and Nick Nikiforakis}, booktitle = {Proceedings of the Web Conference (WWW)}, year = {2025}, }
- [NDSS 2024] Like, Comment, Get Scammed: Characterizing Comment Scams on Media Platforms @inproceedings{li2024commentscams, title = {Like, Comment, Get Scammed: Characterizing Comment Scams on Media Platforms}, author = {Xigao Li and Amir Rahmati and Nick Nikiforakis}, booktitle = {Proceedings of the Network and Distributed System Security Symposium (NDSS)}, year = {2024}, }
- [Thesis] Measuring the Role of Automation in Malicious Web Activities
- [WWW 2023] Scan Me If You Can: Understanding and Detecting Unwanted Vulnerability Scanning @inproceedings{li2023scanme, author = {Li, Xigao and Amin Azad, Babak and Rahmati, Amir and Nikiforakis, Nick}, title = {Scan Me If You Can: Understanding and Detecting Unwanted Vulnerability Scanning}, year = {2023}, booktitle = {Proceedings of the ACM Web Conference (WWW)}, }
- [NDSS 2023] Double and Nothing: Understanding and Detecting Cryptocurrency Giveaway Scams @inproceedings{li2023cryptoscams, title = {Double and Nothing: Understanding and Detecting Cryptocurrency Giveaway Scams}, author = {Xigao Li and Anurag Yepuri and Nick Nikiforakis}, booktitle = {Proceedings of the Network and Distributed System Security Symposium (NDSS)}, year = {2023}, }
- [Oakland 2021] Good Bot, Bad Bot: Characterizing Automated Browsing Activity @inproceedings{li2021botcharacterization, title = {Good Bot, Bad Bot: Characterizing Automated Browsing Activity}, author = {Xigao Li and Babak {Amin Azad} and Amir Rahmati and Nick Nikiforakis}, booktitle = {Proceedings of the 42nd IEEE Symposium on Security and Privacy (IEEE S\&P)}, year = {2021}, }
Research Projects
-
Understanding and Detecting Cryptocurrency Giveaway Scams (Paper Accepted at NDSS 2023!) Paper website
- First large-scale analysis over cryptocurrency giveaway scams.
- Created automated cryptocurrency scam tracking systems to capture cryptocurrency scam webpages.
- Collected 10,079 scam web pages in 6 months, extracted 2,266 cryptocurrency scam wallets.
- Inititated first quantitative analysis to cryptocurrency scam fund loss – attackers have stolen the equivalent of tens of millions of dollars ($26M – $70M).
- Understanding and Detecting Unwanted Vulnerability Scanning (Paper Accepted at WWW 2023!)
- Designed a testbed for web vulnerability scanners (WVS)
- Observed differences between WVS and users though user study
- Designed ScannerScope - supervised machine learning model classifies user vs. WVS
-
Measuring Web Bot Ecosystem (Paper Accepted at Oakland 2021!) Paper PDF
- Created automatic systems that can deploy honeypot-like web servers to capture web bot activities.
- Developed behavioral fingerprinting techniques to detect bot behavior and intention.
- Analyzed bot behaviors, discover malicious bot intentions of bruteforcing, probing and exploiting vulnerabilities.
- Created visualization of captured bot dataset, provide security insights.
-
Malware Classification with Deep Neural Network using Lightweight Emulation Talk PDF
- Developed automated malware emulation pipeline, emulated 11 Million malwares with cost <10 hours for EMBER’17 dataset
- Extracted malware API call sequence, memory access information and RWX counter
- Trained lightGBM and character level CNN model, achieved 0.99 AUROC / 0.98 accuracy
- Developed a hybrid CNN model classifying malware families, reached 0.96 accuracy
-
Malicious URL detection for mobile browsers through Deep Neural Network
- Crawled both malicious and benign URLs from multiple sources
- Trained a classifier through CNN and RNN(LSTM).
- Make the model mobile-available, built a browser demo intergrated with ML model.
Other Projects
Other than major research threads, I build mini-projects for testing new techniques and for fun.
-
Animal breed classification with deep neural network github
- Trained a modified VGG16 model to classify cat/dog images and their specific breeds
- Fine-tuned hyperparameters to achieve best accuracy.
- Developed web app interface to classify animal breed from URL.
-
Empirical study with time series data from Anime market github
- Crawled anime ranking data from 2006 to 2021, extracted anime ranking and scoring data through websites, built a clean ranking dataset
- Analyzed anime ranking trend, visualized with dynamic video [youtube video]
- Analyzed popular anime picture tags through Safebooru, extracted popular tags from 2011 to 2021
- Designed a decay algorithm to measure the popularity of tags over time.
- Built and fine-tuned a multi-label classifier for anime figures based on a modified VGG-19 model; the model can predict possible tags from any anime figures.
-
Anime face dataset and generation through generative adversarial network
- Used face alignment technique to extract faces from ~30,000 anime portraits and ~2,500 cosplay human faces, build a anime-face oriented dataset.
- Generated anime faces through styleGAN2, with aligned 15,000 anime faces through face detection.
- safebooru tag trend from 2010-2020: https://you.stonybrook.edu/xigaoli/safebooru-anime-tag-trend-analysis/
- Another mock personal homepage, but built through Wangler workers: https://my-worker.lxgfrom2009.workers.dev/
- Simple but pragmatic tool blocking SogouInput ads and tracking: https://github.com/xigaoli/sgcld
Lastly, I have something else that is interesting.