# ani-cli Codebase Analysis: How It Retrieves Anime Episodes
Based on an exploration of the `ani-cli` repository, here is how the application fetches anime episodes behind the scenes. `ani-cli` is a single shell script that works primarily by sending API requests to `allanime` endpoints and scraping the responses.
### 1. The Target API
`ani-cli` points to a GraphQL API backend:
* **Base URL**: `https://api.allanime.day/api` (constructed from `allanime_base="allanime.day"`)
* **Referer Header**: To get past basic bot protections, it explicitly sets the HTTP `Referer` header to `https://allmanga.to` (`$allanime_refr`) and sends a browser-like User-Agent string.
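As a minimal sketch of the configuration above, the following Python assembles the endpoint and headers. The helper name `build_request_config` and the User-Agent value are illustrative assumptions; only the base domain, the `/api` path, and the referer value come from the script.

```python
# Values quoted in the analysis above.
ALLANIME_BASE = "allanime.day"
ALLANIME_REFR = "https://allmanga.to"

# Assumption: any browser-like User-Agent; the exact string ani-cli sends is not shown here.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0"


def build_request_config(base: str = ALLANIME_BASE, referer: str = ALLANIME_REFR) -> dict:
    """Return the endpoint URL and headers for an API call (hypothetical helper)."""
    return {
        "url": f"https://api.{base}/api",
        "headers": {"Referer": referer, "User-Agent": USER_AGENT},
    }


config = build_request_config()
print(config["url"])
```

Keeping the domain in one constant mirrors how the script derives the API URL from `allanime_base`, so a domain change only needs one edit.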
### 2. Searching & Episode Lists via GraphQL
The script embeds its GraphQL queries as strings and sends them with `curl -X POST`.
* **Search**: The query stored in `search_gql` calls `shows(search: ...)` to find titles and returns each match's `_id` and episode count.
* **Episode Listing**: Once an `_id` is chosen, the script sends the query stored in `episodes_list_gql` to retrieve the available episodes (e.g., `availableEpisodesDetail`) for the selected sub/dub setting (`translationType`).
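The request body for such a query can be sketched as below. This assumes the usual GraphQL-over-HTTP JSON shape (`query` plus `variables`); the helper `build_graphql_body`, the placeholder query text, and the inner shape of the `search` variable are assumptions for illustration, not copied from the script.

```python
import json


def build_graphql_body(query: str, variables: dict) -> bytes:
    """Serialize a GraphQL POST body of the form {"query": ..., "variables": ...}."""
    return json.dumps({"query": query, "variables": variables}).encode("utf-8")


# Placeholder: the real query text is the string stored in `search_gql`.
search_body = build_graphql_body(
    "<contents of search_gql>",
    {"search": {"query": "one piece"}, "translationType": "sub"},  # assumed variable shape
)
print(search_body.decode("utf-8"))
```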
### 3. Fetching the Episode Video Links
When an episode is selected, `ani-cli` needs the embedded player source. It does this by making another GraphQL request using `episode_embed_gql`.
* It passes the `$showId`, `$translationType` (sub or dub mode), and `$episodeString` (the episode number).
* The API returns a JSON payload containing `sourceUrls`.
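A rough sketch of this step: build the three variables and pull `sourceUrls` out of the response. The variable names follow the analysis above; the nesting `data.episode.sourceUrls`, the field names inside each entry, and the sample response are assumptions made up for illustration.

```python
import json


def embed_variables(show_id: str, mode: str, episode: str) -> dict:
    """Variables for the `episode_embed_gql` query, named as in the analysis above."""
    return {"showId": show_id, "translationType": mode, "episodeString": episode}


def extract_source_urls(response_text: str) -> list:
    """Return the `sourceUrls` list, or [] if any level is missing (assumed nesting)."""
    payload = json.loads(response_text)
    data = payload.get("data") or {}
    episode = data.get("episode") or {}
    return episode.get("sourceUrls") or []


# Made-up sample response, not real API output.
sample = '{"data": {"episode": {"sourceUrls": [{"sourceName": "example", "sourceUrl": "--abc"}]}}}'
print(embed_variables("SHOW_ID", "sub", "1"))
print(extract_source_urls(sample))
```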
### 4. Bypassing Encryption (`tobeparsed`)
Sometimes, `allanime` obfuscates the video source URLs to prevent scraping. The API returns an encrypted base64 payload under the key `"tobeparsed"`.
* `ani-cli` catches this field with `grep -q '"tobeparsed"'`.
* It then routes the blob to a decryption function `decode_tobeparsed()`.
* **The Decryption Method**: It extracts the IV (first 12 bytes of the decoded base64 string) and uses `openssl` to run AES-256-CTR decryption against the rest of the payload.
* **The Key**: The decryption key (`$allanime_key`) is not random; it is derived by taking the SHA-256 hash of the hardcoded salt string `Xot36i3lK3:v1`.
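The key derivation and the IV split described above can be sketched with Python's standard library alone. The AES-256-CTR step itself is left out: the standard library has no AES implementation, and how the 12-byte IV is expanded to the 16-byte counter block that AES requires is not covered in this analysis. The function names and the synthetic blob are assumptions for illustration.

```python
import base64
import hashlib

SALT = "Xot36i3lK3:v1"  # hardcoded salt string quoted in the analysis above


def derive_key(salt: str = SALT) -> bytes:
    """SHA-256 of the salt yields 32 bytes, the key size AES-256 expects."""
    return hashlib.sha256(salt.encode("utf-8")).digest()


def needs_decoding(api_response: str) -> bool:
    """Python counterpart of the `grep -q '"tobeparsed"'` check."""
    return '"tobeparsed"' in api_response


def split_blob(blob_b64: str) -> tuple:
    """Decode the base64 blob and split it into (iv, ciphertext)."""
    raw = base64.b64decode(blob_b64)
    return raw[:12], raw[12:]


# Synthetic blob (not real API data): 12 IV bytes followed by 4 payload bytes.
demo = base64.b64encode(bytes(range(12)) + b"data").decode("ascii")
iv, ciphertext = split_blob(demo)
print(len(derive_key()), iv.hex(), ciphertext)
```

To finish the decryption in Python, the key, IV, and ciphertext would be handed to a third-party AES library, which is the role `openssl` plays in the script.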
### 5. Link Generation & Processing
Once the embed URLs are decrypted (or retrieved plain), they are mapped to respective video providers using `generate_link()`. Providers include `wixmp` (the default), `youtube`, `sharepoint`, and `hianime`.
* The `get_links()` function takes the direct links, requests them, and uses `sed` to extract `.mp4` URLs or `.m3u8` playlist files, depending on the provider's response format.
* Subtitle URLs are also isolated if available.
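The extraction step can be approximated in Python with a regular expression. This is a simplified stand-in for the script's `sed` expressions, not a port of them: it assumes stream links appear as absolute, unescaped `http(s)` URLs ending in `.mp4` or `.m3u8`, optionally followed by a query string. The sample body is made up.

```python
import re

# Assumption: links are absolute URLs with no JSON-escaped slashes.
STREAM_URL = re.compile(r"https?://[^\s\"'<>\\]+?\.(?:mp4|m3u8)(?:\?[^\s\"'<>\\]*)?")


def extract_stream_urls(body: str) -> list:
    """Return every .mp4 / .m3u8 URL found in a provider response, in order."""
    return STREAM_URL.findall(body)


# Made-up provider response for illustration.
sample_body = (
    '{"link":"https://cdn.example/v/ep1.mp4",'
    '"hls":"https://cdn.example/v/master.m3u8?token=abc"}'
)
print(extract_stream_urls(sample_body))
```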
### 6. Streaming or Downloading
Finally, these isolated stream links (along with the necessary referrer headers) are passed directly into standard media players like `mpv`, `vlc`, `android_vlc`, or downstream download managers like `aria2c`.
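As an illustration of the hand-off, the sketch below builds an `mpv` command line that carries the referrer along with the stream URL. `--referrer` and `--force-media-title` are existing `mpv` options, but the exact flags `ani-cli` passes are not shown in this analysis, so treat the argument list as an assumption. Nothing is executed.

```python
import shlex


def mpv_command(stream_url: str, referer: str, title: str) -> list:
    """Build an mpv argv list (hypothetical helper); the command is not run here."""
    return [
        "mpv",
        f"--referrer={referer}",
        f"--force-media-title={title}",
        stream_url,
    ]


cmd = mpv_command("https://cdn.example/v/ep1.mp4", "https://allmanga.to", "Example Ep 1")
print(shlex.join(cmd))
```

Building an argv list rather than a single string avoids quoting problems when titles contain spaces; passing `cmd` to `subprocess.run` would launch the player.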
---
## Code Reference (Line Numbers)
Here are the exact line numbers in the `ani-cli` script where these specific mechanisms are implemented:
* **API Configuration & Keys**:
    * `allanime_refr="https://allmanga.to"`: **Line 405**
    * `allanime_base="allanime.day"`: **Line 406**
    * `allanime_api="https://api.${allanime_base}"`: **Line 407**
    * `allanime_key` (The hardcoded AES key hash): **Line 408**
* **GraphQL Queries**:
    * `episode_embed_gql` (Fetching the video player URLs): **Line 227**
    * `search_gql` (Searching for anime titles): **Line 257**
    * `episodes_list_gql` (Getting available episodes): **Line 280**
* **The Decryption Logic**:
    * The `decode_tobeparsed()` function where the AES-256-CTR decryption happens: **Lines 211 - 221**
    * The check that routes the response to the decryption function (`if printf "%s" "$api_resp" | grep -q '"tobeparsed"'; then`): **Line 230**
---
## Is this a partnership, or can you do it yourself?
**You can absolutely do this yourself.** This is **not** an official partnership.
What the developers of `ani-cli` have done is known as **Reverse Engineering** and **Web Scraping**. When you watch a video on a site like *allanime* in your normal web browser, your browser has to know how to talk to their servers to get the video files. Because all of this happens on the client-side (in your browser), the instructions are visible if you know where to look.
Here is how developers figure this out, and how you can too, for almost any website:
1. **Network Tab Inspection**: If you open your browser's Developer Tools (F12) and go to the "Network" tab, you can see every request the website makes. If you search for an anime, you will see a `POST` request going to `https://api.allanime.day/api`.
2. **Payload Analysis**: By clicking on that network request, you can see exactly what data was sent (the GraphQL query) and what the server responded with (the JSON payload).
3. **Bypassing Basic Protections**: Websites try to stop automated scripts from doing this by checking headers. The developers saw that the site checks the `Referer` header to make sure the request is coming from `https://allmanga.to`. So, they simply programmed `ani-cli` to fake that header (`curl -e "https://allmanga.to"`).
4. **Finding Encryption Keys**: When the site started returning encrypted `"tobeparsed"` blobs instead of plain video URLs, the developers of `ani-cli` likely opened the "Sources" tab in their browser's Developer Tools, downloaded the website's obfuscated JavaScript files, and reverse-engineered how the web player decrypts the video. That's how they found the exact AES algorithm (`aes-256-ctr`) and the hardcoded salt string (`Xot36i3lK3:v1`).
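Steps 1 to 3 amount to replaying a captured request with the headers the site expects. A minimal sketch using only Python's standard library follows; the helper name `replay_request`, the `Content-Type` value, and the placeholder query are assumptions, and the request is only constructed, not sent.

```python
import json
import urllib.request


def replay_request(url: str, referer: str, body: dict) -> urllib.request.Request:
    """Rebuild a captured POST request with the Referer the site checks for."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Referer": referer, "Content-Type": "application/json"},
        method="POST",
    )


req = replay_request(
    "https://api.allanime.day/api",
    "https://allmanga.to",
    {"query": "<query copied from the Network tab>", "variables": {}},
)
print(req.get_method(), req.full_url, req.get_header("Referer"))
```

Passing `req` to `urllib.request.urlopen` would actually send it, which is the equivalent of the script's `curl -e "https://allmanga.to"` call.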
**Can you do this?**
Yes! You can use Python (with `requests` and `BeautifulSoup`), a shell script (this one uses `curl`, `grep`, and `sed`), or Node.js to replicate these network requests for almost any site.
*Note: Because this is reverse-engineered, sites frequently change their API endpoints, encryption keys, or security measures to break scrapers, which is why tools like `ani-cli` require constant updates.*