Meet ted! Your new way of downloading tv shows from the web!
Add your favourite tv shows to ted and ted will automatically download torrents of new episodes!
FS#213 - Rewrite parser
Details
Currently the parser is such a mess that it's hard to add new functionality. Please investigate a proper way to set up a new parser and implement it!
Also: the parser could be way more object oriented. We should create a torrent class that stores and retrieves information (like seeders, size, publish date and so forth) from itself when the parser needs it.
Josh: have you started implementing anything on this subject?
I will open a separate bug for that.
or even looking at one of the torrent clients like vuze and using/basing a version off what they have
1. defined new Feed interface, contains a list of FeedResult (contains all the result info, seeders, leechers, torrent url)
2. defined new Feed getter interface, implemented a Daily feed getter and Series feed getter using RSS to create a Feed
3. defined new Parser interface, interface has a list of "validators" (for validating results) and "listeners" (for communicating events/changes), 1 "selector" (for selecting the best torrent) and 1 "feed getter" (for getting the feed)
4. defined implementations of "selectors" based off the current parser (MOSTSEEDERS, BESTRATIO, BESTDAILY)
5. defined implementations of "validators", (minimum seeders, file size, compressed files, single episode only)
6. NOT FINISHED: update TedSerie to be a listener, so that it can be updated when the parser changes status.
7. NOT FINISHED: update TedSerie to contain a "parse log" which contains all the info about the parse (i.e. "torrent BLAH rejected, not enough seeders, <LINK TO TORRENT>"). This will allow us to set up downloading from the log, or ignoring results in the next parse.
8. created a Parser factory, which takes a TedSerie and builds a parser, populating the "validators" for the series type (daily or series), adding the "selector" based on series type/config, adding the feed getter based on the series type.
This means that the parser is all components, so we can add, remove or change them easily. It also means we are not stuck with RSS for the feed source (I know isohunt has a JSON API).
NOTE: my parser loops through the "validators", which each loop through the feed results; the current parser loops through the results, which each loop through the validations. Doing it my way is faster/technically better (i.e. less startup/creation time), but it means that the current progress bar setup will not work. I could change the "validators" to work the same way as the current setup if my way is unacceptable or you prefer it that way, but I was thinking we could have the progress bar say "initializing", "reading", "parsing", "selecting best", "finished".
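To make the component idea concrete, here is a minimal sketch of how the pieces from the list above could fit together. All type names, method names and signatures (FeedResult fields, `FeedGetter.fetch`, `Validator.rejectReason`, `Selector.selectBest`, `ComponentParser`) are assumptions made for illustration; the thread only names the concepts, not the actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and signatures are NOT taken from the ted source.
final class FeedResult {
    final String title;
    final String torrentUrl;
    final int seeders;
    final int leechers;

    FeedResult(String title, String torrentUrl, int seeders, int leechers) {
        this.title = title;
        this.torrentUrl = torrentUrl;
        this.seeders = seeders;
        this.leechers = leechers;
    }
}

/** Produces the feed; could be backed by RSS, JSON or anything else. */
interface FeedGetter {
    List<FeedResult> fetch();
}

/** Returns a rejection reason, or null when the result passes. */
interface Validator {
    String rejectReason(FeedResult result);
}

/** Picks the best result out of the ones that survived validation. */
interface Selector {
    FeedResult selectBest(List<FeedResult> candidates);
}

/** Receives events from the parser (for example a TedSerie). */
interface ParserListener {
    void onEvent(String message);
}

final class ComponentParser {
    private final FeedGetter feedGetter;
    private final List<Validator> validators;
    private final Selector selector;
    private final List<ParserListener> listeners = new ArrayList<>();

    ComponentParser(FeedGetter feedGetter, List<Validator> validators, Selector selector) {
        this.feedGetter = feedGetter;
        this.validators = validators;
        this.selector = selector;
    }

    void addListener(ParserListener listener) {
        listeners.add(listener);
    }

    /** Validators form the outer loop, results the inner loop. Returns null if nothing is left. */
    FeedResult parse() {
        List<FeedResult> remaining = new ArrayList<>(feedGetter.fetch());
        for (Validator validator : validators) {
            List<FeedResult> kept = new ArrayList<>();
            for (FeedResult result : remaining) {
                String reason = validator.rejectReason(result);
                if (reason == null) {
                    kept.add(result);
                } else {
                    notifyListeners(result.title + " rejected: " + reason);
                }
            }
            remaining = kept;
        }
        return remaining.isEmpty() ? null : selector.selectBest(remaining);
    }

    private void notifyListeners(String message) {
        for (ParserListener listener : listeners) {
            listener.onEvent(message);
        }
    }
}
```

Because every part is an interface, swapping the RSS feed getter for another source or adding a validator would not touch the parser itself.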
example flow:
TedSerie ("the blah show")
Factory creates Parser with daily feed implementation, size validator, minimum seeders validator, best daily torrent selector.
Parser reads daily feed
Parser has 4 results (result 1, result 2, result 3, result 4)
size validator runs (removes result 1, too small)
minimum seeders validator runs (removes result 2, not enough seeders)
best daily torrent selector runs (rejects torrent 3 because it aired BEFORE the last downloaded episode, downloads result 4 because it is best)
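The example flow above can be sketched as a small runnable program. This is only an illustration: the class names, the thresholds (100 MB, 20 seeders), the dates, and the rule that "best" means "most seeders" are all assumptions, not values from ted. The sketch also skips results that aired on the same day as the last downloaded episode, which the flow above does not specify.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical result type; field names are illustrative only.
final class DailyResult {
    final String name;
    final int sizeMb;
    final int seeders;
    final LocalDate airDate;

    DailyResult(String name, int sizeMb, int seeders, LocalDate airDate) {
        this.name = name;
        this.sizeMb = sizeMb;
        this.seeders = seeders;
        this.airDate = airDate;
    }
}

final class DailyFlowDemo {
    static final int MIN_SIZE_MB = 100; // assumed threshold
    static final int MIN_SEEDERS = 20;  // assumed threshold

    /** Returns the chosen result, or null when nothing survives. */
    static DailyResult run(List<DailyResult> feed, LocalDate lastDownloaded) {
        List<DailyResult> remaining = new ArrayList<>(feed);
        // Size validator: drops results that are too small.
        remaining.removeIf(r -> r.sizeMb < MIN_SIZE_MB);
        // Minimum seeders validator: drops results with too few seeders.
        remaining.removeIf(r -> r.seeders < MIN_SEEDERS);
        // Best daily selector: skip anything not newer than the last download,
        // then keep the candidate with the most seeders (assumed meaning of "best").
        DailyResult best = null;
        for (DailyResult r : remaining) {
            if (!r.airDate.isAfter(lastDownloaded)) {
                continue;
            }
            if (best == null || r.seeders > best.seeders) {
                best = r;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        LocalDate last = LocalDate.of(2010, 3, 10); // made-up date
        List<DailyResult> feed = Arrays.asList(
            new DailyResult("result 1", 5, 80, LocalDate.of(2010, 3, 11)),   // too small
            new DailyResult("result 2", 350, 3, LocalDate.of(2010, 3, 11)),  // too few seeders
            new DailyResult("result 3", 350, 90, LocalDate.of(2010, 3, 9)),  // aired before last download
            new DailyResult("result 4", 350, 60, LocalDate.of(2010, 3, 11)));
        DailyResult chosen = run(feed, last);
        System.out.println(chosen == null ? "nothing found" : chosen.name); // prints "result 4"
    }
}
```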
Sounds good and resonates with the ideas I had in my head. I have a few suggestions/questions:
1 - Why the different feed types (daily/serie)? I don't see why the feeds would be different for a specific kind of show? I would like to get rid of as much "daily"/"season-episode" differences that we have now. Your validators are a good starting point for this. Why introduce two kinds of feeds?
2 - I miss the validator for validating the correct episode from a torrent title.
2 - I have some ideas for future features like manually picking a torrent from a failed search when no torrents have been found. Not from the textual log but more from a list of torrents that were found and their reject reasons somewhere in a popup window. I would propose to introduce a torrent wrapper class that can remember all this and store the last search result somewhere (inside the TedSerie?). I will try to post a UI mockup of my ideas soon.
3 - That torrent wrapper class could also hold lots of methods that are now performed in the parser like: getting the seeders, translating the info url to a torrent url, compute the size of its contents, etc. And be a place to remember all this. It could also implement a way of retrieving the season/episode number or daily date from the torrent title so that the validator for the correct episode can use that.
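As a rough illustration of the "retrieve the season/episode number from the torrent title" part, a wrapper class could use a couple of regular expressions. The class and method names here (`TorrentTitle`, `parseEpisode`, `EpisodeId`) are hypothetical, and only the `S02E02` and `02x02` styles are handled; daily dates are left out of this sketch.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value type for a season/episode pair.
final class EpisodeId {
    final int season;
    final int episode;

    EpisodeId(int season, int episode) {
        this.season = season;
        this.episode = episode;
    }
}

// Hypothetical helper; not part of the existing ted code base.
final class TorrentTitle {
    // Tried in order: "S02E02" style first, then "02x02" style.
    private static final Pattern[] PATTERNS = {
        Pattern.compile("\\bS(\\d{1,2})E(\\d{1,2})\\b", Pattern.CASE_INSENSITIVE),
        Pattern.compile("\\b(\\d{1,2})x(\\d{1,2})\\b", Pattern.CASE_INSENSITIVE)
    };

    /** Returns the season/episode found in the title, or empty if none is recognised. */
    static Optional<EpisodeId> parseEpisode(String title) {
        for (Pattern pattern : PATTERNS) {
            Matcher matcher = pattern.matcher(title);
            if (matcher.find()) {
                int season = Integer.parseInt(matcher.group(1));
                int episode = Integer.parseInt(matcher.group(2));
                return Optional.of(new EpisodeId(season, episode));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<EpisodeId> id = parseEpisode("the blah show S02E02.torrent");
        System.out.println(id.isPresent()
            ? "season " + id.get().season + ", episode " + id.get().episode
            : "no episode found");
    }
}
```

A validator for the correct episode could then compare the parsed pair against the episode the serie is waiting for, and reject titles where nothing is recognised.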
4 - The progress bar and status of the serie are not really compatible with your way of parsing and I don't know if that really matters. But the advantage of the status bar right now is that you can see on which torrent ted is stuck.
2a. I only listed a couple of validators; so far I have written about 11 validators. They are: DailyEpisode, SeriesEpisode, DailyLatest, HDEpisode, MinimumSeeders, PublicTracker, BestRatio, MostSeeders, SingleEpisode, Size and UncompressedFile. Still need to write one for double episodes. Any others you can think of?
2b. It sounds like your idea is a lot like my idea of having a list of rejected torrents inside the TedSerie. I'd like to see what your UI mockup would look like.
3. The size, seeders, and all the translating will be handled by the FeedResult or Feed. This allows us to extend the feeds to possibly include different file types down the track, e.g. NZB files.
4. If the parser works it shouldn't get stuck: if an error happens with a result, that result should be rejected. I will play around and maybe have the progress bar loop through with the validators as well.
2a) seems like you got them all. UncompressedFile might not be the best name, I would call it 'BlockedExtensions' or something
2b) See the attached file. My idea is that this dialog could be opened from the main dialog when a search has failed (or maybe even when it succeeded), replacing the call to check the log. Please give some feedback, I haven't discussed this with anyone yet.
3) OK, so FeedResult actually represents a torrent? What happens if we add support for NZB?
2b.) Your idea is much more detailed than mine, but easily done. I will work on including that information in the feed log.
3.) At the moment the FeedResult represents a result, not specifically a torrent. It could be extended to include anything. I haven't looked at adding NZB, but a result is generic enough that we could do it.
In the end the look and feel of the screen should be independent of the implementation of the parser.
There is only one issue with your mockup: once a result is rejected it isn't validated any further. If I have a torrent with 10 seeders (below minimum) and a 100 MB file (below minimum), and the parser is set up to validate seeders before file size, then the log will only have something like "rejected: not enough seeders".
We could validate everything on every torrent, but that would be a little slower.
We could 1.) remove obviously wrong episodes (wrong season, wrong date), then 2.) process ALL validations for ALL remaining results, noting ALL rejection reasons.
eg: 3 episodes
RESULT 1: the blah show Season 1.torrent, 10 seeders, 20 leechers, 1000mb
RESULT 2: the blah show S02E02.torrent, 10 seeders, 20 leechers, 300mb
RESULT 3: the blah show 02x02.torrent, 50 seeders, 100 leechers, 350mb
Loop through: remove RESULT 1, it's a season, not an episode; validate RESULT 2 and reject it because it doesn't have enough seeders and the file is too small; RESULT 3 passes all validations.
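A small sketch of that two-step approach, using the three results from the example. The thresholds (20 seeders, 320 MB) are assumptions picked only so that the example behaves as described, and the names (`Candidate`, `Check`, `RejectionReport`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical result type for this sketch.
final class Candidate {
    final String title;
    final int seeders;
    final int leechers;
    final int sizeMb;

    Candidate(String title, int seeders, int leechers, int sizeMb) {
        this.title = title;
        this.seeders = seeders;
        this.leechers = leechers;
        this.sizeMb = sizeMb;
    }
}

/** Returns a rejection reason, or null when the candidate passes. */
interface Check {
    String reject(Candidate candidate);
}

final class RejectionReport {
    static final int MIN_SEEDERS = 20;  // assumed threshold
    static final int MIN_SIZE_MB = 320; // assumed threshold

    /** Step 1 filter: true when the title looks like a single episode (S02E02 or 02x02). */
    static boolean isSingleEpisode(String title) {
        return title.matches("(?i).*\\b(S\\d{1,2}E\\d{1,2}|\\d{1,2}x\\d{1,2})\\b.*");
    }

    /** Maps each remaining title to ALL of its rejection reasons (empty list means it passed). */
    static Map<String, List<String>> evaluate(List<Candidate> feed) {
        List<Check> checks = new ArrayList<>();
        checks.add(c -> c.seeders < MIN_SEEDERS ? "not enough seeders" : null);
        checks.add(c -> c.sizeMb < MIN_SIZE_MB ? "file too small" : null);

        Map<String, List<String>> report = new LinkedHashMap<>();
        for (Candidate candidate : feed) {
            if (!isSingleEpisode(candidate.title)) {
                continue; // step 1: obviously wrong, not reported further
            }
            List<String> reasons = new ArrayList<>();
            for (Check check : checks) { // step 2: run every check, keep every reason
                String reason = check.reject(candidate);
                if (reason != null) {
                    reasons.add(reason);
                }
            }
            report.put(candidate.title, reasons);
        }
        return report;
    }

    public static void main(String[] args) {
        List<Candidate> feed = Arrays.asList(
            new Candidate("the blah show Season 1.torrent", 10, 20, 1000),
            new Candidate("the blah show S02E02.torrent", 10, 20, 300),
            new Candidate("the blah show 02x02.torrent", 50, 100, 350));
        for (Map.Entry<String, List<String>> entry : evaluate(feed).entrySet()) {
            System.out.println(entry.getKey() + " -> "
                + (entry.getValue().isEmpty() ? "accepted" : "rejected: " + entry.getValue()));
        }
    }
}
```

The extra cost compared to stopping at the first failure is one pass of the remaining checks per surviving result, which buys a complete list of reasons for the dialog.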