research api next up is starting to write it
# ani-cli Codebase Analysis: How It Retrieves Anime Episodes

Based on an exploration of the `ani-cli` repository, this is how the application fetches anime episodes behind the scenes. `ani-cli` is a shell script that works primarily by making API requests to `allanime` endpoints and scraping the responses.

### 1. The Target API

`ani-cli` points to a GraphQL API backend:
* **Base URL**: `https://api.allanime.day/api`, built from `allanime_base="allanime.day"` via `allanime_api="https://api.${allanime_base}"`, with the `/api` path appended when the request is made.
* **Referer Header**: To get past basic bot protections, it explicitly sets the HTTP `Referer` header to `https://allmanga.to` (`$allanime_refr`) and passes a user-agent string.
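
As a rough illustration of the two points above, the following Python sketch prepares (without sending) a request that carries the same base URL and `Referer` value. The user-agent string, the JSON content type, and the helper name `build_api_request` are assumptions made for this example, not values taken from the script.

```python
import urllib.request

# Values quoted in the analysis above.
ALLANIME_REFR = "https://allmanga.to"
ALLANIME_BASE = "allanime.day"
ALLANIME_API = f"https://api.{ALLANIME_BASE}"

# Assumption: a generic browser-like user-agent; the script's own value may differ.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/121.0"


def build_api_request(body: bytes) -> urllib.request.Request:
    """Prepare, but do not send, a POST request to the API endpoint."""
    return urllib.request.Request(
        url=f"{ALLANIME_API}/api",
        data=body,
        headers={
            "Referer": ALLANIME_REFR,
            "User-Agent": USER_AGENT,
            # Assumption: the body is sent as JSON.
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_api_request(b"{}")
print(request.get_method(), request.full_url)
print("Referer:", request.get_header("Referer"))
```

Nothing is sent over the network here; passing `request` to `urllib.request.urlopen` would perform the actual call.
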

### 2. Searching & Episode Lists via GraphQL

The script embeds its GraphQL queries as strings and sends them via `curl -X POST`.

* **Search**: It queries `shows(search: ...)` using a query named `search_gql` to find titles, and gets back each title's `_id` and episode count.
* **Episode Listing**: Once an `_id` is chosen, it sends `episodes_list_gql` to retrieve the available episodes (e.g., `availableEpisodesDetail`) for the chosen sub/dub setting (`translationType`).
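
A minimal sketch of how such a request body might be assembled in Python. The query text, the GraphQL type names, and the variable layout below are placeholders modelled on the description above; the actual `search_gql` string in `ani-cli` may differ.

```python
import json

# Placeholder query: field and type names are assumptions, not copied from the script.
SEARCH_GQL = """
query ($search: SearchInput, $translationType: String) {
  shows(search: $search, translationType: $translationType) {
    edges { _id name availableEpisodes }
  }
}
"""


def build_graphql_payload(query: str, variables: dict) -> str:
    """Serialize a GraphQL query and its variables into a JSON request body."""
    return json.dumps({"query": query, "variables": variables})


def search_payload(title: str, translation_type: str = "sub") -> str:
    """Build the body for a title search in the chosen sub/dub mode."""
    variables = {
        "search": {"query": title},  # assumed shape of the search input
        "translationType": translation_type,
    }
    return build_graphql_payload(SEARCH_GQL, variables)


body = search_payload("one piece", "dub")
print(body)
```

The same `build_graphql_payload` helper would serve the episode-list query; only the query string and the variables change.
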

### 3. Fetching the Episode Video Links

When an episode is selected, `ani-cli` needs the embedded player source. It gets this with another GraphQL request, `episode_embed_gql`.

* It passes the `$showId`, `$translationType` (sub or dub), and `$episodeString` (the episode number).
* The API returns a JSON payload containing `sourceUrls`.
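
The request variables and the response handling can be sketched as follows. The response shape used here (a `data.episode.sourceUrls` list with `sourceName` and `sourceUrl` fields) and the sample values are illustrative assumptions; only the three variable names and the `sourceUrls` key come from the analysis above.

```python
import json


def episode_variables(show_id: str, translation_type: str, episode_string: str) -> dict:
    """Variables for the episode-embed query, named as in the analysis above."""
    return {
        "showId": show_id,
        "translationType": translation_type,
        "episodeString": episode_string,
    }


def extract_source_urls(response_text: str) -> list:
    """Return the `sourceUrls` entries from a response, or an empty list."""
    payload = json.loads(response_text)
    # Assumed nesting: data -> episode -> sourceUrls.
    episode = (payload.get("data") or {}).get("episode") or {}
    return episode.get("sourceUrls") or []


# Made-up response used only to exercise the parser.
SAMPLE_RESPONSE = json.dumps({
    "data": {
        "episode": {
            "episodeString": "1",
            "sourceUrls": [
                {"sourceName": "example-provider", "sourceUrl": "https://example.com/embed/1"},
            ],
        }
    }
})

variables = episode_variables("abc123", "sub", "1")
sources = extract_source_urls(SAMPLE_RESPONSE)
for source in sources:
    print(source["sourceName"], source["sourceUrl"])
```
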

### 4. Bypassing Encryption (`tobeparsed`)

Sometimes `allanime` obfuscates the video source URLs to prevent scraping. In that case the API returns an encrypted, base64-encoded payload under the key `"tobeparsed"`.

* `ani-cli` detects this field with `grep -q '"tobeparsed"'`.
* It then routes the blob to a decryption function, `decode_tobeparsed()`.
* **The Decryption Method**: It base64-decodes the blob, takes the first 12 bytes as the IV, and uses `openssl` to run AES-256-CTR decryption on the rest of the payload.
* **The Key**: The decryption key (`$allanime_key`) is derived at runtime as the SHA-256 hash of a hardcoded string: `Xot36i3lK3:v1`.
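
The key derivation and the IV split described above can be sketched with the Python standard library alone. This is not a full decryptor: the AES-256-CTR step itself is left out because the standard library has no AES implementation, and because the text above does not say how the 12-byte IV is expanded into the 16-byte counter block that AES-CTR requires. The function names and the dummy blob are hypothetical.

```python
import base64
import hashlib

KEY_SEED = "Xot36i3lK3:v1"  # hardcoded string quoted in the analysis above
IV_LENGTH = 12              # bytes, per the analysis above


def derive_key(seed: str = KEY_SEED) -> bytes:
    """Return the 32-byte AES-256 key: the SHA-256 digest of the seed string."""
    return hashlib.sha256(seed.encode("utf-8")).digest()


def split_blob(blob_b64: str) -> tuple:
    """Base64-decode a blob and split it into (iv, ciphertext)."""
    raw = base64.b64decode(blob_b64)
    if len(raw) <= IV_LENGTH:
        raise ValueError("blob is too short to contain an IV and a ciphertext")
    return raw[:IV_LENGTH], raw[IV_LENGTH:]


# Dummy 20-byte blob, used only to show the split; it is not real API data.
dummy_blob = base64.b64encode(bytes(range(20))).decode("ascii")
key = derive_key()
iv, ciphertext = split_blob(dummy_blob)
print(len(key), len(iv), len(ciphertext))
```

The resulting `key`, `iv`, and `ciphertext` are the inputs that would then be handed to an AES-256-CTR implementation such as the `openssl` command the script uses.
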

### 5. Link Generation & Processing

Once the embed URLs are decrypted (or retrieved in plain form), `generate_link()` maps them to their video providers. Providers include `wixmp` (the default), `youtube`, `sharepoint`, and `hianime`.

* The `get_links()` function requests each provider link and uses `sed` to extract `.mp4` URLs or `.m3u8` playlists, depending on the provider's format.
* Subtitle URLs are also extracted if available.
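
The extraction step can be approximated in Python with a regular expression in place of `sed`. This is a simplified sketch, not the script's own patterns: the regex, the helper names, and the sample provider response are all assumptions.

```python
import re

# Match http(s) URLs ending in .mp4 or .m3u8, with an optional query string.
STREAM_URL_RE = re.compile(
    r"https?://[^\s\"'<>\\]+?\.(?:mp4|m3u8)(?:\?[^\s\"'<>\\]*)?"
)


def extract_stream_urls(body: str) -> list:
    """Return the unique stream URLs found in a provider response, in order."""
    urls = []
    for match in STREAM_URL_RE.finditer(body):
        url = match.group(0)
        if url not in urls:
            urls.append(url)
    return urls


def is_playlist(url: str) -> bool:
    """True for .m3u8 playlists, False for direct files such as .mp4."""
    return url.split("?", 1)[0].endswith(".m3u8")


# Made-up provider response used only to exercise the pattern.
SAMPLE_BODY = (
    '{"links":[{"link":"https://example.com/video/ep1.mp4"},'
    '{"link":"https://example.com/hls/master.m3u8?token=abc"}]}'
)

stream_urls = extract_stream_urls(SAMPLE_BODY)
for url in stream_urls:
    print("playlist" if is_playlist(url) else "file", url)
```
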

### 6. Streaming or Downloading

Finally, the extracted stream links (along with the necessary referrer headers) are passed directly to media players such as `mpv`, `vlc`, or `android_vlc`, or to a download manager such as `aria2c`.
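
As a hedged illustration of this hand-off, the sketch below builds (but does not run) the command line for `mpv` or `aria2c`, using `mpv`'s `--referrer` option and `aria2c`'s `--referer` option. The helper name and the sample URL are hypothetical, and the flags `ani-cli` actually passes may differ.

```python
import shlex


def build_player_command(player: str, url: str, referer: str) -> list:
    """Return the argument list for handing a stream URL to a player or downloader."""
    if player == "mpv":
        return ["mpv", f"--referrer={referer}", url]
    if player == "aria2c":
        return ["aria2c", f"--referer={referer}", url]
    raise ValueError(f"unsupported player: {player}")


command = build_player_command(
    "mpv", "https://example.com/hls/master.m3u8", "https://allmanga.to"
)
# Print the command instead of executing it; subprocess.run(command) would launch it.
print(shlex.join(command))
```
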
---
## Code Reference (Line Numbers)

Here are the line numbers in the analyzed version of the `ani-cli` script where these mechanisms are implemented:

* **API Configuration & Keys**:
    * `allanime_refr="https://allmanga.to"`: **Line 405**
    * `allanime_base="allanime.day"`: **Line 406**
    * `allanime_api="https://api.${allanime_base}"`: **Line 407**
    * `allanime_key` (the AES key, the SHA-256 hash of a hardcoded string): **Line 408**
* **GraphQL Queries**:
    * `episode_embed_gql` (fetching the video player URLs): **Line 227**
    * `search_gql` (searching for anime titles): **Line 257**
    * `episodes_list_gql` (getting available episodes): **Line 280**
* **The Decryption Logic**:
    * The `decode_tobeparsed()` function, where the AES-256-CTR decryption happens: **Lines 211-221**
    * The check that routes the response to the decryption function (`if printf "%s" "$api_resp" | grep -q '"tobeparsed"'; then`): **Line 230**
---
## Is this a partnership, or can you do it yourself?

**You can absolutely do this yourself.** This is **not** an official partnership.

What the developers of `ani-cli` have done is known as **reverse engineering** and **web scraping**. When you watch a video on a site like *allanime* in a normal web browser, the browser has to know how to talk to the site's servers to get the video files. Because all of this happens on the client side (in your browser), the instructions are visible if you know where to look.

Here is how developers figure this out for almost any website, and how you can too:

1. **Network Tab Inspection**: If you open your browser's Developer Tools (F12) and go to the "Network" tab, you can see every request the website makes. If you search for an anime, you will see a `POST` request going to `https://api.allanime.day/api`.
2. **Payload Analysis**: By clicking on that network request, you can see exactly what data was sent (the GraphQL query) and what the server responded with (the JSON payload).
3. **Bypassing Basic Protections**: Websites try to stop automated scripts by checking request headers. The developers saw that the site checks the `Referer` header to make sure the request comes from `https://allmanga.to`, so they programmed `ani-cli` to send that header itself (`curl -e "https://allmanga.to"`).
4. **Finding Encryption Keys**: When the site started returning encrypted `"tobeparsed"` blobs instead of plain video URLs, the developers of `ani-cli` likely opened the "Sources" tab in their browser's Developer Tools, downloaded the website's obfuscated JavaScript files, and reverse-engineered how the web player decrypts the video. That is presumably how they found the AES algorithm (`aes-256-ctr`) and the hardcoded string (`Xot36i3lK3:v1`).

**Can you do this?**
Yes! You can use tools like Python (with `requests` and `BeautifulSoup`), shell scripting (as this script does with `curl`, `grep`, and `sed`), or Node.js to replicate these network requests for most sites.

*Note: Because this approach relies on reverse engineering, and sites frequently change their API endpoints, encryption keys, or security measures to break scrapers, tools like `ani-cli` require constant updates.*