Docs
Changelog
Front-end v0.0.5 – May 17, 2024
- Fixed
Added pages to the site:
- Page/Contacts
- Page/About
Crawler-child v0.0.4 – May 16, 2024
- Fixed
I came across a 28-megabyte page. Since the child server is on the cheapest hosting, the provider killed the page processing script. The script did not complete the job correctly and added garbage to the tables, which caused the scan to continue.
Due to this bug, page scanning was stopped for 3 days. Bug fixed, scanning resumed.
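The entry does not say how the bug was fixed. One common guard against oversized pages is to cap the number of bytes read per page, sketched below. The 5 MB limit, the `read_capped` name, and the in-memory streams are all assumptions for illustration, not the project's actual code.

```python
import io
from typing import BinaryIO, Optional

MAX_PAGE_BYTES = 5 * 1024 * 1024  # hypothetical cap; the changelog names no limit


def read_capped(stream: BinaryIO, max_bytes: int = MAX_PAGE_BYTES,
                chunk_size: int = 64 * 1024) -> Optional[bytes]:
    """Read a page body in chunks; return None if it exceeds max_bytes."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            return None  # too large: skip the page instead of processing it
        chunks.append(chunk)
    return b"".join(chunks)


# Usage with in-memory stand-ins for HTTP response bodies.
small = read_capped(io.BytesIO(b"x" * 1000), max_bytes=4096)
huge = read_capped(io.BytesIO(b"x" * 10_000), max_bytes=4096)
```

Reading in chunks means the script can give up on a huge page early, before the whole body sits in memory.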
Front-end v0.0.4 – May 10, 2024
- Updated
- The front end downloads a JSON file from the parent's server that lists the goods found.
- I’m not a designer or layout designer, so I found a ready-made template on the Internet for the site. I'll use it for now.
Front-end v0.0.3 – May 7, 2024
- Updated
Added sections to the site:
- Doc
- Help
Added pages to the site:
- Doc/Changelog - this page ;)
- Help/Submit a request
- Help/bot
Crawler-parent v0.0.3 – May 5, 2024
- Updated
- Robots files and pages are now downloaded from the child’s server in two threads.
- Changed indexes from btree to hash. The size of the indexes was halved and the speed of query processing did not drop, which I consider a wonderful result!
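A minimal sketch of the two-thread download described above, assuming one thread per kind of data. The two fetch functions are hypothetical stubs standing in for the real transfers from the child server.

```python
import threading
from typing import Callable, Dict, List


def fetch_robots_batch() -> List[str]:
    # Hypothetical stand-in for downloading robots files from the child server.
    return ["robots:example.com"]


def fetch_pages_batch() -> List[str]:
    # Hypothetical stand-in for downloading processed pages from the child server.
    return ["page:example.com/", "page:example.com/about"]


def download_in_two_threads() -> Dict[str, List[str]]:
    results: Dict[str, List[str]] = {}

    def run(name: str, fetch: Callable[[], List[str]]) -> None:
        results[name] = fetch()  # each thread writes only its own key

    threads = [
        threading.Thread(target=run, args=("robots", fetch_robots_batch)),
        threading.Thread(target=run, args=("pages", fetch_pages_batch)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until both downloads have finished
    return results


results = download_in_two_threads()
```

Because the work is network-bound, plain threads are usually enough here; neither download has to wait for the other to finish.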
Crawler-child v0.0.3 – May 2, 2024
- Updated
- Now I do all the development on the local computer; the child server is synchronized via Git.
- Updated scripts are now restarted only after they have finished their current work, not immediately as before, which lost part of the work done.
- I can also correctly pause any of the scripts individually, as well as the entire child server.
Now I can completely control the child’s server by changing records in the database, without logging in to the server itself! Later I will need to develop a control panel for the child server. Processing sped up to ~800,000 pages per day on one child node!
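Controlling scripts through database records can be sketched roughly as follows. The `script_control` table, its columns, the command values, and the in-memory SQLite database are all assumptions; the changelog does not describe the real schema or database engine. The worker checks its record only between tasks, so a pause or restart request never interrupts work in progress.

```python
import sqlite3
from typing import Callable, List

conn = sqlite3.connect(":memory:")  # stand-in for the project's real database
conn.execute("CREATE TABLE script_control (name TEXT PRIMARY KEY, command TEXT)")
conn.execute("INSERT INTO script_control VALUES ('page_processor', 'run')")


def read_command(name: str) -> str:
    """Return the current command for a script ('run' if no record exists)."""
    row = conn.execute(
        "SELECT command FROM script_control WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else "run"


def run_worker(name: str, tasks: List[str],
               process: Callable[[str], str]) -> List[str]:
    """Process tasks one by one; obey the control record only between tasks."""
    done = []
    for task in tasks:
        if read_command(name) != "run":
            break  # pause or restart requested: do not start new work
        done.append(process(task))
    return done


def process(task: str) -> str:
    if task == "b":
        # Simulate an operator requesting a restart while task "b" is running.
        conn.execute(
            "UPDATE script_control SET command = 'restart' "
            "WHERE name = 'page_processor'"
        )
    return task.upper()  # placeholder for real page processing


finished = run_worker("page_processor", ["a", "b", "c"], process)
```

In this run the restart request arrives during task "b", so "b" still completes and "c" is left for the restarted script.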
v0.0.2 – April 26, 2024
- Updated
Storage and processing of robots files now happen on the parent server. Before this, the child server was doing it.
v0.0.1 – April 18, 2024
- Updated
The first version is ready! The previous bot was divided into two. Now the child server downloads, processes and sends pages to the parent server. The parent server distributes tasks to the child server. Processing sped up to 200,000 pages per day.
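The parent/child split can be pictured with a small sketch: the parent hands out batches of URLs and stores whatever comes back. The class, method names, and batch size are hypothetical, and the "processing" is a placeholder; the real servers talk over the network rather than in one process.

```python
from collections import deque
from typing import Deque, Dict, List


class Parent:
    """Hands out URL batches and stores what children send back."""

    def __init__(self, urls: List[str]) -> None:
        self.queue: Deque[str] = deque(urls)
        self.store: Dict[str, str] = {}

    def next_batch(self, size: int) -> List[str]:
        batch = []
        while self.queue and len(batch) < size:
            batch.append(self.queue.popleft())
        return batch

    def receive(self, pages: Dict[str, str]) -> None:
        self.store.update(pages)


def child_process(batch: List[str]) -> Dict[str, str]:
    # Placeholder for download + processing; real code would fetch each URL.
    return {url: f"processed:{url}" for url in batch}


parent = Parent(["a.example/1", "a.example/2", "b.example/1"])
while True:
    batch = parent.next_batch(2)
    if not batch:
        break
    parent.receive(child_process(batch))
```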
April 14, 2024
- New
To scan hundreds of millions of pages per day, I need hundreds and maybe thousands of bots, controlled by a parent. Started programming Crawler parent.
v0.0.2 – March 27, 2024
- Updated
Started developing the database structure for Front-end.
March 1, 2024
- Updated
I am thinking about resuming work on the project.
October 3, 2023
- Frozen
I stopped working on the project because I realized how large its scope is. It takes a lot of time and money to implement it.
v0.0.1 – October 3, 2023
- New
Started programming Front-end. Made one page "Coming soon..."
v0.0.2 – September 27, 2023
- Fixed
Fixed a bug with deleting files. Page processing speed increased to 80,000 pages per day.
v0.0.1 – September 24, 2023
- Updated
The bot began scanning website pages all over the Internet. In 24 hours the bot processed 46,628 pages!
September 13, 2023
- New
Registered the domain redeken.com
September 11, 2023
- New
Started programming Crawler child
September 5, 2023
- In my mind