The site is getting absolutely hammered by bots. Arduboy has been growing in popularity in Russia. Coincidence? Maybe. I think someone is scraping the Arduboy forum for AI training.
Like, get bent ya hoser. This is costing me thousands in annual hosting fees.
The forum has features to mitigate this, but they are woefully misguided in their design. Ivory tower, no dog food eating nonsense. They have a white list, but that will block any user agent not on it. This means search engines in smaller countries that I don't know about would not index the site. Maybe that is the arbitrage here.
There is also a rate limiting feature, which would work perfectly in this situation, but it is only applied to user agents that are specifically targeted for rate limiting. You cannot apply rate limiting to "new" user agents that you have not yet captured.
The easy fix here would be to just have an option to rate limit user agents that are not on a white list.
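To make that concrete, here is a rough Python sketch of the idea. This is not how Discourse works internally and none of these names are real Discourse settings; `ALLOWED_AGENTS`, `UnknownAgentLimiter`, and the thresholds are all made up for illustration. Anything on the white list passes untouched, and everything else gets a simple fixed-window limit per user agent string.

```python
import time

# Hypothetical white list: substrings matched against the User-Agent header.
ALLOWED_AGENTS = ("Googlebot", "bingbot", "DuckDuckBot")


class UnknownAgentLimiter:
    """Fixed-window rate limiter applied only to agents NOT on the white list."""

    def __init__(self, max_requests, window_seconds, allowed=ALLOWED_AGENTS):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.allowed = tuple(name.lower() for name in allowed)
        self._windows = {}  # user agent -> (window start time, request count)

    def is_whitelisted(self, user_agent):
        ua = user_agent.lower()
        return any(name in ua for name in self.allowed)

    def allow(self, user_agent, now=None):
        """Return True if this request should be served."""
        if self.is_whitelisted(user_agent):
            return True  # white-listed agents are never limited
        now = time.monotonic() if now is None else now
        start, count = self._windows.get(user_agent, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window expired, start a fresh one
        count += 1
        self._windows[user_agent] = (start, count)
        return count <= self.max_requests
```

With something like `UnknownAgentLimiter(max_requests=60, window_seconds=60)`, a small search engine I have never heard of could still crawl at a sane pace, while a scraper hammering the site would start getting refused.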
Even better, and I am scratching my head and starting to get a little pissed as to why this isn't a feature:
Why is there no feature to block or rate limit a user agent after it hits a certain number of views in a certain time??? Like: Wow I've never seen this user agent before and it's generated 100x as much traffic as a normal user, maybe I'll put them in timeout.
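Again, just a sketch of what I mean, not anything Discourse actually ships. The class name `AgentTimeout` and its parameters are hypothetical. It counts views per user agent in a sliding time window, and once an agent goes over the limit it gets put in timeout for a while.

```python
import time
from collections import defaultdict, deque


class AgentTimeout:
    """Put a user agent in timeout once it exceeds max_views in a sliding window."""

    def __init__(self, max_views, window_seconds, timeout_seconds):
        self.max_views = max_views
        self.window_seconds = window_seconds
        self.timeout_seconds = timeout_seconds
        self._views = defaultdict(deque)  # user agent -> recent view timestamps
        self._blocked_until = {}          # user agent -> time the timeout ends

    def allow(self, user_agent, now=None):
        """Return True if this view should be served."""
        now = time.monotonic() if now is None else now

        # Still in timeout? Refuse without counting the view.
        if now < self._blocked_until.get(user_agent, float("-inf")):
            return False

        # Drop timestamps that have fallen out of the window.
        views = self._views[user_agent]
        while views and now - views[0] >= self.window_seconds:
            views.popleft()

        views.append(now)
        if len(views) > self.max_views:
            # Too many views in the window: start the timeout.
            self._blocked_until[user_agent] = now + self.timeout_seconds
            views.clear()
            return False
        return True
```

No list of known agents needed. An agent that shows up out of nowhere and pulls way more pages than a normal visitor trips the limit on its own, and everyone else never notices.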
This really is a weak link in Discourse forums right now. I doubt I'm the only one dealing with this problem.
Does anyone have any suggestions here? At this point I think I will just have to create a very inclusive white list and just deal with the sad reality I'm forcing people to use the major search engines to find out about Arduboy.
Maybe that's not a big deal? Am I making a bigger deal out of it than it needs to be?
I mean, Discourse is free, open source and can be self-hosted. So I keep running into situations where it would be in my best interest to host Discourse myself. It sort of seems like their support system is designed to push customers in this direction instead of actually addressing their needs, i.e. Discourse hosting seems only interested in taking money from low-hanging fruit.
/rant