diff --git a/README.md b/README.md
index 82a397f..213b3a8 100644
--- a/README.md
+++ b/README.md
@@ -48,6 +48,58 @@ this relatively quick but YMMV.
 
 1. Download any song(s) you want with `c3dbdl [options] download [options]`.
 
+## Database & Included Data
+
+The database is contained in a JSON document which lists all possible songs which were scraped from the C3DB
+pages during the `database build` step.
+
+To obtain the database, first the specified base URL is downloaded to get a list of pages, and then each page
+is iterated through. Within each page, all "song" table entries are extracted for information, and the song
+page itself visited to obtain a full list of download links. The song iteration is performed in parallel with
+a default of 10 simultaneous jobs (configurable with `-c`/`--concurrency`) to speed up downloading.
+
+Once all pages and songs have been scanned, the results are saved into the database file specified, which can
+then be reused for future downloads. Note that cancelling a `database build` before it is finished will result
+in an empty database and the process will have to be started again from the beginning.
+
+A database file cannot be updated; it must be replaced wholesale.
+
+The contents of the database include all information required for filtering and downloading as described below.
+An example entry (first entry on the first page) is:
+
+```
+{
+    "artist": "Heatwave",
+    "title": "Boogie Nights",
+    "album": "Too Hot to Handle",
+    "song_link": "https://db.c3universe.com/song/-34018",
+    "genre": "Pop-Rock",
+    "year": "1976",
+    "length": "0:05",
+    "author": "D97",
+    "dl_links": [
+        {
+            "link": "https://dl.c3universe.com/642d6ab2aa5b87.10964554",
+            "description": "Rock Band 3 Xbox 360"
+        }
+    ]
+}
+```
+
+### Download Links
+
+The `c3dbdl` tool is very picky about the download links (`dl_links`) it selects. Specifically, it will *only*
+include links from `c3universe.com`, and not any other external "download sites" such as Mega.nz, Angelfire,
+etc.
+
+This is done because the non-iteractive, command-based download method is not compatible with those sites, and
+we want this tool to be as automated as possible. Requiring some manual clickthrough of a web page would defeat
+the purpose here, and thus, we simply exclude them and require you download any such songs manually.
+
+If a song ends up with no `dl_links` during scanning, for instance because they all pointed to such external
+"download sites", it will not be included in the database. Thus, the final number of songs in your database is
+guaranteed to be smaller than the total number listed on the C3DB website.
+
 ## Filtering
 
 Filtering out the songs in the database is a key part of this tool. You might want to be able to grab only select
@@ -79,17 +131,15 @@ from the album Vapor Trails (the remixed version) authored by ejthedj:
 
 ```
 c3dbdl download --filter artist Rush --filter album "Vapor Trails [Remixed]" --author ejthedj
-```
-
-This shouldfind , as of 2023-04-02, exactly one song, "Sweet Miracle":
-
-```
-Found 28942 songs from JSON database file 'Downloads/c3db.json'
+Found 19563 songs from JSON database file 'Downloads/c3db.json'
 Downloading 1 song files...
-Downloading song "Rush - Sweet Miracle" by ejthedj...
-Downloading from https://dl.c3universe.com/s/ejthedj/sweetMiracle...
+> Downloading song "Rush - Sweet Miracle" by ejthedj...
+Downloading file "Rock Band 3 Xbox 360" from https://dl.c3universe.com/s/ejthedj/sweetMiracle...
+Successfully downloaded to ../Prog/ejthedj/Rush/Vapor Trails [Remixed]/Sweet Miracle [2002].sweetMiracle
 ```
 
+In this case, one song matched and was downloaded.
+
 In addition to the above filters, within each song may be more than one download link. To filter these links,
 use the "-i"/"--download-id" and "-d"/"--download-descr" (see the help for details).
 
@@ -112,20 +162,26 @@ which are mapped at download file. The available fields are:
 * `title`: The title of the song.
 * `year`: The year of the album/song.
 * `author`: The author of the file on C3DB.
-* `orig_file`: The original filename that would be downloaded by e.g. a browser.
+* `orig_name`: The original filename that would be downloaded by e.g. a browser.
 
-The default structure leverages all of these options to create an archive-ready structure as follows:
+The default structure leverages most of these options to create an archive-ready structure as follows:
 
 ```
-{genre}/{author}/{artist}/{album}/{title} [{year}].{orig_file}
+{artist}/{album}/{title}.{author}.{orig_name}
 ```
 
-As an example:
+As an example, as shown in the previous section:
 
 ```
-Prog/Rush/Vapor Trails [Remixed]/Sweet Miracle [2002] (ejthedj).sweetMiracle
+Rush/Vapor Trails [Remixed]/Sweet Miracle.ejthedj.sweetMiracle
 ```
 
+The genre is excluded because in my experience it is a fairly useless metric and is often incorrectly set,
+so it gets in the way more often than not. You are free of course to add it in to your own custom structure.
+The date is excluded for similar reasons and because if you know the album, you know the date.
+
+If any field is missing during download, it is replaced with "None".
+
 Note that any parent director(ies) will be automatically created down the whole tree until the final filename.
 
 ## Help
diff --git a/c3dbdl/c3dbdl.py b/c3dbdl/c3dbdl.py
index 3a56651..9347cb0 100755
--- a/c3dbdl/c3dbdl.py
+++ b/c3dbdl/c3dbdl.py
@@ -436,7 +436,7 @@ def database():
     "--file-structure",
     "_file_structure",
     envvar="C3DBDL_DL_FILE_STRUCTURE",
-    default="{genre}/{author}/{artist}/{album}/{title} [{year}].{orig_name}",
+    default="{artist}/{album}/{title}.{author}.{orig_name}",
     help="Specify the output file/directory stucture.",
 )
 @click.option(
@@ -490,7 +490,7 @@ def download(_filters, _id, _desc, _limit, _file_structure):
 
     \b
     The default output file structure is:
-    "{genre}/{author}/{artist}/{album}/{title} [{year}].{orig_name}"
+    "{artist}/{album}/{title}.{author}.{orig_name}"
 
     Filters allow granular selection of the song(s) to download. Multiple filters can be
     specified, and a song is selected only if ALL filters match (logical AND). Each filter