User talk:Citation bot

You may want to increment {{Archive basics}} to |counter= 28 as User talk:Citation bot/Archive 27 is larger than the recommended 150Kb.

Note that the bot's maintainer and assistants (Thing 1 and Thing 2) can go weeks without logging in to Wikipedia. The code is open source, and interested parties are invited to assist with the operation and extension of the bot. Before reporting a bug, please note:
  • Addition of DUPLICATE_xxx= to citation templates by this bot is a feature. When there are two identical parameters in a citation template, the bot renames one to DUPLICATE_xxx= to point out the problem with the template. The solution is to choose one of the two parameters and remove the other, or to convert it to an appropriate parameter.
  • A 503 error means that the bot is overloaded and you should try again later; wait at least an hour.

Or, for a faster response from the maintainers, submit a pull request with an appropriate code fix on GitHub, if you can write the needed code.

Consistent spacing

Status: new bug
Reported by: Abductive (reasoning) 03:24, 2 August 2021 (UTC)[reply]
What happens: the bot added a date parameter to a ref that has a space before every pipe, but did not include a space before the new parameter's pipe
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=53W53&type=revision&diff=1036681818&oldid=1036681278
We can't proceed until: Feedback from maintainers


I know this is a minor bug, but it bugs me. I know that the bot is written to make an attempt to duplicate the formatting already present in the ref. How it could have failed here, I don't know. But more importantly, it should default to the consensus ref formatting: space,pipe,parametername,=,parametervalue. (Spaces before pipes, no spaces around the equals signs or anywhere else, except perhaps before the curly end brackets if there already was a space there.) Abductive (reasoning) 03:24, 2 August 2021 (UTC)[reply]

I agree. The default should be space,pipe,parametername,=,parametervalue. --BrownHairedGirl (talk) • (contribs) 15:27, 2 August 2021 (UTC)[reply]
Cannot fix, since the bot already uses the existing citation template as a guide. Templates with mixed spacing such as these cannot be handled in a way that makes everyone happy. AManWithNoPlan (talk) 16:45, 2 August 2021 (UTC)[reply]
But how to explain the example? The bot deviated from the format of the ref it edited? Abductive (reasoning) 16:59, 2 August 2021 (UTC)[reply]
I see, you want the bot to add spaces to existing parameters - in particular the last one. Interesting; the bot by default does not in any way modify the spacing of existing parameters. That parameter has no trailing spaces. As far as the bot is concerned there are no spaces before pipes, just spaces at the end of parameters. AManWithNoPlan (talk) 17:14, 2 August 2021 (UTC)[reply]
The bot must have looked at the lack of a space on the last parameter (before the closing curly braces) and concluded that the ref was formatted that way. Perhaps it should look just after the "cite xxxx" for the cue? Abductive (reasoning) 17:51, 2 August 2021 (UTC)[reply]
No, that is not what it did. It simply does not change the spacing of existing parameters. The existing final parameter has no ending space, so the bot does not add one. AManWithNoPlan (talk) 21:14, 2 August 2021 (UTC)[reply]
Ah, I see what you are saying. It slotted it in at the end. Well, I had hoped that the bot could have provided a cure to the annoying new habit of users removing all spaces from refs, making a wall of text for editors. Abductive (reasoning) 22:25, 2 August 2021 (UTC)[reply]
And creates annoyingly unpredictable line wraps. Does this format really have consensus? If so, bots (any bot) could create a cosmetic function for citations they edit. -- GreenC 17:04, 6 August 2021 (UTC)[reply]
There are some people who like the "crammed" format. I started a conversation about the formatting here, but I don't really understand what they were saying. Abductive (reasoning) 02:06, 7 August 2021 (UTC)[reply]
As Abductive suggests, what the bot should do ideally is to check if the first parameter's pipe following the template name is preceded by a space (or even better, if at least one of the parameters' pipe symbol is preceded by space) and if it is, it should add a space in front of pipe symbol of newly inserted parameters, no matter where they are inserted into the parameter list. If the template has no parameters yet, the bot should fall back to the "default" format "space, pipe, parameter name, equal sign, parameter value" we consistently use in all CS1/CS2 documentation and examples. (Well, IMO, this latter format would ideally be made the only format used at all, but that's a discussion beyond the scope of CB issues here.)
Yeah, it is only cosmetic, but like Abductive I too find it somewhat annoying when previously perfectly formatted citations become misaligned by bot edits.
--Matthiaspaul (talk) 13:34, 7 August 2021 (UTC)[reply]
While I agree, this is actually going to be hard to implement. I will need to think about it. AManWithNoPlan (talk) 18:12, 8 August 2021 (UTC)[reply]
Still thinking about how to do this. It will have to deal with figuring out what the last parameter is before adding a parameter at the very end, but not in the middle. AManWithNoPlan (talk) 00:51, 4 September 2021 (UTC)[reply]
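
For illustration, a minimal sketch of the heuristic Matthiaspaul describes above: detect whether the pipes already in the template are preceded by a space, copy that style when appending a new parameter, and fall back to the documented "space, pipe, name=value" default when the template has no parameters yet. It is written in Python rather than the bot's actual PHP, and the function name and the simplified, non-nested template handling are assumptions, not the bot's code.

  import re

  def insert_param(template: str, name: str, value: str) -> str:
      """Append |name=value to a citation template, copying the
      existing space-before-pipe convention (a sketch, not the
      bot's real logic)."""
      inner = template.strip()[2:-2]  # drop the surrounding {{ and }}
      # Use the default spaced style when no parameters exist yet,
      # or when at least one existing pipe is preceded by whitespace.
      spaced = ("|" not in inner) or bool(re.search(r"\s\|", inner))
      sep = " |" if spaced else "|"
      return "{{" + inner + sep + name + "=" + value + "}}"

  # Both styles are preserved:
  print(insert_param("{{cite web |url=https://example.org |title=X}}", "date", "2 August 2021"))
  # {{cite web |url=https://example.org |title=X |date=2 August 2021}}
  print(insert_param("{{cite web|url=https://example.org|title=X}}", "date", "2 August 2021"))
  # {{cite web|url=https://example.org|title=X|date=2 August 2021}}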

Only four requests at a time?

Status: new bug
Reported by: Abductive (reasoning) 22:54, 2 August 2021 (UTC)[reply]
What happens: It seems that the bot can only work on four jobs at any one time.
We can't proceed until: Feedback from maintainers


I sampled the bot's edits going back a few days, and it seems that the bot can only interleave four requests at any one time, and cannot accept even a single-page request beyond that. At no point can I find five jobs interleaving, and (although this is harder to be certain about) at no point when there are four jobs interleaving can a fifth job be found, even a single requested page. Is this deliberate, and if so, is it really a necessary constraint on the bot? Abductive (reasoning) 22:54, 2 August 2021 (UTC)[reply]
That is what I have observed and complained about also. I am convinced that the default PHP config is 4. Someone with tool service access needs to get the bot a custom lighttpd config file. AManWithNoPlan (talk) 23:03, 2 August 2021 (UTC)[reply]
Gah. Abductive (reasoning) 23:07, 2 August 2021 (UTC)[reply]
lol, you people with "jobs". the rest of us with single page requests can't get anything in no matter how many jobs.  — Chris Capoccia 💬 11:20, 3 August 2021 (UTC)[reply]
https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web/Lighttpd Look at PHP and the "Default_configuration" area that starts collapsed. AManWithNoPlan (talk) 19:18, 3 August 2021 (UTC)[reply]
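
For reference, a sketch of what such a custom .lighttpd.conf override could look like, following the pattern on the Toolforge help page linked above. The values are illustrative assumptions, not Citation bot's actual settings; as noted further down this page, the default works out to 2 processes x 2 children = 4 concurrent PHP requests, which matches the four-job ceiling observed here.

  # Hypothetical .lighttpd.conf for the tool (illustrative only)
  fastcgi.server = ( ".php" =>
      ((
          "bin-path" => "/usr/bin/php-cgi",
          "socket" => "/tmp/php.socket",
          # defaults are effectively 2 x 2 = 4 concurrent PHP requests;
          # raising these would allow more simultaneous jobs
          "max-procs" => 4,
          "bin-environment" => ( "PHP_FCGI_CHILDREN" => "4" )
      ))
  )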

This is also part of the wider problem that the bot needs much more capacity, and that a lot of its time is taken up by speculative trawls through wide sets of articles which have not been identified as needing bot attention and which often produce little change. Huge categories are being fed to the bot, which changes little over 10% of them, and most of those changes are trivia (type of quote mark in title) or have no effect at all on output (removing redundant parameters or changing template type). It would help a lot if those speculative trawls were given a lower priority. --BrownHairedGirl (talk) • (contribs) 22:54, 9 August 2021 (UTC)[reply]

Who would decide what "speculative trawls" are? And what should the limit be? It might be hard to find something that can be agreed on. Perhaps the users that request these large categories see them as very important, while you don't. Of course it will be easy to know that certain specially-created maintenance categories will give a high output and do a lot of work, but if a user just wants to request a "normal" category, they can't know beforehand what percentage of the pages will actually get changed.
I agree capacity should be increased; more jobs at the same time would be such a good thing. However, deciding that one page might be more important than another does not fix the root cause.
I do think there might be something to be said for giving priority to people who request a single page (or a small number of pages). A person could be running the most important category that exists, but if I just want to have a single page checked about a topic that I am knowledgeable about or have a big interest in, it is hard to swallow waiting for multiple thousand-page jobs to finish. This has actually made me just give up a few times, leaving pages that could have been fixed and checked (with my knowledge about said subject) broken; I'm sure many can recognise themselves in this.
It is indeed important to fix high-priority pages, and especially to improve capacity, but let's not forget about the people who edit on topics that they enjoy, and just want to use the bot on something that might not be important according to some maintenance category, but is important to them. The more people that want to keep using the bot, the better! Redalert2fan (talk) 00:22, 10 August 2021 (UTC)[reply]
@Redalert2fan: I think you are missing my point, which is that there is no benefit to anyone in having the bot process lots of articles where there is nothing for it to do. No matter how important anyone thinks an article is, there is no gain in having the bot spend ages deciding that there is nothing to do.
The reason that single pages get locked out is because the bot's capacity is being used up by these speculative trawls, by which I mean simply that they are not categories selected because they concentrate articles which need the bot's attention -- these are "see if you find anything" batches, rather than "cleanup the problem here" batches.
One or two editors are repeatedly feeding it big categories on a huge variety of topics simply because they are big categories which fit under the 4,400 limit for categories. I have analysed the results, and in many cases the result is that only 10%–15% of the pages are edited, and only about half of those have non-trivial changes. So about 95% of the time that the bot spends on these huge categories is completely wasted.
When a resource is limited, it is best used by prioritising pages which have been selected on the basis that there is a high likelihood of something to do. --BrownHairedGirl (talk) • (contribs) 00:44, 10 August 2021 (UTC)[reply]
I see; there is no denying that there is no gain in having the bot spend ages deciding that there is nothing to do.
Wouldn't it be an even quicker fix to ask these few editors why they run these not-so-helpful categories, and notify them of the problems it causes and the benefits to be gained by not requesting such a category? It seems more like operator error than a bot mistake, and limiting the bot's abilities over something that is caused by a few users seems questionable.
I agree with the points you make, but I don't feel we should limit hundreds of potential editors who request pages with what you would describe as "less than optimal requests" because of two or so people. Even though we are limited, I don't think we need a strict priority system. If some random editor wants to request a hundred pages that have their interest, we can't expect everyone to know beforehand whether their request is an optimal use of the system, and "see if you find anything" might still be an honest use case. Obviously, if as you say specific editors who seem to know what they are doing use the bot in a way that basically blocks out others for little gain all the time, action should be taken.
Some sort of priority system might indeed be a good idea, whether in the form of important maintenance categories, "pages with a high likelihood of something to do", or just giving priority to small requests etc. Though it has to be a priority system for some types of requests, not a limitation for all requests, in my opinion, especially if the problem comes from a very small selection of users. Redalert2fan (talk) 01:11, 10 August 2021 (UTC)[reply]
Max category size shrunk by one-quarter. AManWithNoPlan (talk) 13:08, 10 August 2021 (UTC)[reply]
Thanks, AManWithNoPlan. That's helpful, but wouldn't it be better to reduce it to the same 2200 limit size as the linked-from-page limit? --BrownHairedGirl (talk) • (contribs) 14:32, 10 August 2021 (UTC)[reply]
@Redalert2fan: Since early July, when I started using Citation bot in to clean up bare URLs, I have seen two editors repeatedly using the bot unproductively.
One was JamCor, who was using the bot to process the same set of almost 200 articles 3 or 4 times per day. Many of the articles were huge, taking several minutes each to process, so I estimated that about 20% of the bot's time was being taken on those 200 articles. I raised it on JamCor's talk, and they stopped, but only after the second request.
The other is Abductive, with whom I raised the problem several times on this page: see User talk:Citation_bot/Archive 27#throttling_big_category_runs. Sadly, that one persists, and I gave up making my case. When I started writing this post a few hours ago in response to you, I analysed the then-current recent contribs of the bot. Abductive had started the bot scanning Category:Use dmy dates from August 2012, and by then the bot had processed 1433 of the category's 4379 pages, but had saved an edit on only 141 of them, i.e. less than 10%. As with many of Abductive's previous big runs, I can't see any way in which this run could have been selected as a set which concentrates articles of interest to Abductive, or which concentrates articles of high importance, or which concentrates articles that have been identified as being likely to have problems this bot can fix. The only criterion which I can see for its selection is that its size (4379 pages) is very close to the 4400 maximum size of Citation bot category jobs. A quick glance at the parent Category:Use dmy dates shows that very few categories are so close to the limit without exceeding it.
So AFAICS, the only reason for selecting this job was that it is a big set of articles which can be thrown at the bot with no more effort than copy-pasting the category title. I may of course have missed something, and if so I hope that Abductive will set me right. --BrownHairedGirl (talk) • (contribs) 14:33, 10 August 2021 (UTC)[reply]
I meant cut to one fourth, not cut by one fourth. So, category is now half the linked pages API. AManWithNoPlan (talk) 14:37, 10 August 2021 (UTC)[reply]
Cut to 1100 items! This is extreme. Grimes2 (talk) 14:53, 10 August 2021 (UTC)[reply]
@AManWithNoPlan: thanks. 1100 is much better.
However, that will reduce but not eliminate the problem of an editor apparently creating bot jobs just because they can. Such jobs will now require 4 visits to the webform in the course of a day, rather than just one, but that's not much extra effort. --BrownHairedGirl (talk) • (contribs) 15:13, 10 August 2021 (UTC)[reply]
People using the bot is not a problem. Abductive (reasoning) 18:02, 10 August 2021 (UTC)[reply]
Indeed, people using the bot is not a problem.
The problem is one person who repeatedly misuses the bot. --BrownHairedGirl (talk) • (contribs) 18:27, 10 August 2021 (UTC)[reply]
It is not possible to misuse the bot. Having the bot make the tedious decisions on what needs fixing is far more efficient than trying to work up a bunch of lists. Unfortunately, even the best list can be ruined if the API that the bot checks happens to be down. This is why it is inadvisable to create lists that concentrate on one topic. Abductive (reasoning) 19:49, 10 August 2021 (UTC)[reply]
Bot capacity is severely limited. There is no limit to how much editors can use other tools to make lists, so that makes more efficient use of the bot.
Think of the bot like a hound, which is far more effective at finding quarry if started in the right place. The hound will waste a lot of time if started off miles away from the area where previous clues are.
Lots of other editors are targeting the bot far more effectively than your huge category runs. --BrownHairedGirl (talk) • (contribs) 22:11, 10 August 2021 (UTC)[reply]
Hey BrownHairedGirl, I agree with your ideas, but in the end there are no rules for what the bot can be used for, so calling it misuse isn't a fair description. Anyone is allowed to use it for anything. Abductive can request what he wants, and creating bot jobs just because you can is allowed. In my eyes every page is valid to check (provided it isn't just a repeat of the same page or groups of pages frequently). Redalert2fan (talk) 00:13, 11 August 2021 (UTC)[reply]
Just to be sure, whether that is the optimal way to use the bot or not is still a fair point of discussion. Redalert2fan (talk) 00:17, 11 August 2021 (UTC)[reply]
The question of self-restraint by users of an unregulated shared asset is a big topic in economics.
The article on the tragedy of the commons is an important read. It's well written but long. If you want a quick summary, see the section #Metaphoric meaning.
In this case, it would take only 4 editors indiscriminately feeding the bot with huge sets of poorly-selected articles to create a situation where 90% of the bot's efforts changed nothing, and only 5% did anything non-trivial. That would be a tragic waste of the fine resource which the developers and maintainers of this bot have created, and would soon lead to calls for regulation. The question now is whether enough editors self-regulate to avoid the need for restrictions. --BrownHairedGirl (talk) • (contribs) 05:30, 11 August 2021 (UTC)[reply]
@AManWithNoPlan: the new limit of 1100 does not seem to have taken effect; see this[1] at 18:07, where the bot starts work on a category of 1156 pages.
That may be due to expected delays in how things get implemented, but I thought it might help to note it. --BrownHairedGirl (talk) • (contribs) 18:51, 10 August 2021 (UTC)[reply]
Bot rebooted. AManWithNoPlan (talk) 20:15, 10 August 2021 (UTC)[reply]
Max category size cut again, to 550, and the bot now prints out the list of category pages so that people can use the linked-pages API instead. This also means that if the bot crashes, the person can restart it where it left off, instead of redoing the whole thing as happens with the category code. AManWithNoPlan (talk) 20:25, 10 August 2021 (UTC)[reply]
Great work! Thanks. --BrownHairedGirl (talk) • (contribs) 22:02, 10 August 2021 (UTC)[reply]

It seems that the low-return speculative trawls have re-started. @Abductive has just run a batch job of Category:Venerated Catholics by Pope John Paul II; 364 pages, of which only 29 pages were actually edited by the bot, so 92% of the bot's efforts on this set were wasted. The lower category limit has helped, because this job is 1/10th of the size of similar trawls by Abductive before the limit was lowered ... but it's still not a good use of the bot. How can this sort of thing be more effectively discouraged? --BrownHairedGirl (talk) • (contribs) 11:57, 27 August 2021 (UTC)[reply]

A number of editors have pointed out to you that using the bot this way is perfectly acceptable. In addition, there are almost always four mass jobs running, meaning that users with one article can't get access to the bot. A run of 2200 longer articles takes about 22 hours to complete, so if I had started one of those, it would have locked such users out for nearly a day. By running a job that lasted less than a hour, I hoped that requests for smaller and single runs could be accommodated. And, in fact, User:RoanokeVirginia was able to use the bot as soon as my run completed. Abductive (reasoning) 18:14, 27 August 2021 (UTC)[reply]
@Abductive: on the contrary, you are the only editor who repeatedly wastes the bot's time in this way. It is quite bizarre that you regard setting the bot to waste its time as some sort of good use.
On the previous two occasions when you did it, the result was that the limits on job size were drastically cut. --BrownHairedGirl (talk) • (contribs) 18:47, 27 August 2021 (UTC)[reply]
That was in response to your complaints. Since I ran a job that was within the new constraints, I was not misusing the bot. You should request that the limits be increased on manually entered jobs, and decreased on category jobs. There is no particular reason that 2200 is the maximum. Abductive (reasoning) 18:52, 27 August 2021 (UTC)[reply]
@Abductive: you continue to evade the very simple point that you repeatedly set the bot to do big jobs which achieve almost nothing, thereby displacing and/or delaying jobs which do improve the 'pedia. --BrownHairedGirl (talk) • (contribs) 19:04, 27 August 2021 (UTC)[reply]
Using the bot to check a category for errors is an approved function of the bot. The fundamental problem is the limit of 4 jobs at a time. Also, the bot is throttled to run considerably slower than it could, which is a holdover from the time when it was less stable. The various throttlings, which as I recall were implemented in multiple places, should be re-examined and the bot re-tuned for its current capabilities. Abductive (reasoning) 19:11, 27 August 2021 (UTC)[reply]
This is not complicated. Whatever the bot's speed of operation, and whatever the limit on concurrent jobs, its capacity is not well used by having it trawl large sets of pages where it has nothing to do. I am surprised that you repeatedly choose to ignore that. --BrownHairedGirl (talk) • (contribs) 19:19, 27 August 2021 (UTC)[reply]
I am not ignoring anything. Bots exist to do tedious editing tasks. Your notion that editors have to do the tedious work before giving the bot a task is contrary to the purpose of bots. A number of proposals have been put forward to improve bot performance or relieve pressure on the bot, such as allowing multiple instances of the bot, or allowing users to run the bot from their userspace. These proposals have not been implemented. As the bot is currently configured, there will always be load problems. Abductive (reasoning) 19:29, 27 August 2021 (UTC)[reply]
Load problems that you are exacerbating. We've requested a million times to have better scheduling, or more resources, but no dice so far. You're cognizant there's an issue, and yet you repeatedly feed the bot low-priority, low-efficiency work. That's pretty WP:DE / WP:IDIDNTHEARTHAT behaviour from where I stand. Headbomb {t · c · p · b} 19:34, 27 August 2021 (UTC)[reply]
I have been holding off on all different kinds of runs lately. Check the bot's edits for the last week or so. Abductive (reasoning) 00:58, 28 August 2021 (UTC)[reply]
Abductive, increasing the bot's capacity would:
  • require a lot of work by the editors who kindly donate their time to maintain and develop this bot. WP:NOTCOMPULSORY, and they should not be pressed to donate more time. Their efforts are a gift from them, not a contract.
  • exacerbate to some extent the usage limitations of the external tools which the bot relies on. Increasing the speed of the bot's operation will mean that those limits are encountered more frequently.
The bot will probably always have load problems, because there is so much work to be done.
Two examples:
  1. Headbomb's jobs of getting the bot to clean up refs to scholarly journals. That is high-value, because peer-reviewed journals are the gold standard of WP:Reliable sources, and it is also high labour-saving, because those citations are very complex and thus a big job for editors to fix manually. It is high-intensity work for the bot because many of the articles have dozens or even hundreds of citations. I dunno Headbomb's methodology for building those jobs or what numbers can be estimated from that, but I assume that tens of thousands of such pages remain to be processed.
  2. my jobs targeting bare URLs are focused on a longstanding problem of the core policy WP:V being undermined by linkrot, which may become unfixable. I have lists already prepared of 75,000 articles which need the bot's attention, and have a new methodology mostly mapped out to tackle about 300,000 of the remaining 450k articles with bare URL refs.
My lists are (like Headbomb's lists) all of bot-fixable problems, so they don't waste the bot's time, but they do not tackle such high-value issues as Headbomb's list, so I regard mine as a lesser priority than Headbomb's.
So whatever the bot's capacity, there will be enough high-priority, high-efficiency work to keep it busy for a long time to come. It is not at all helpful for that work to be delayed or displaced because one editor likes to run big jobs but doesn't like doing the prep work to create productive jobs.
In the last few weeks I have approached 4 editors about what seemed to me to be poor use of the bot.
Only Abductive persists. --BrownHairedGirl (talk) • (contribs) 20:57, 27 August 2021 (UTC)[reply]
Reading the discussion above, I think that this issue is becoming increasingly adversarial and perhaps a concrete proposal for action would be a way to fix this.
This could include:
1) If it is an easy technical fix (the maintainers would need to chime in on this), bringing the PHP issue to someone with tool service access and increasing the bot's capacity
2) Adopting a definition/policy on "speculative trawling", perhaps with a notice on the bot page to nudge users into considering the bot's limited capacity.
3) Any other ideas?
@Abductive @BrownHairedGirl @Headbomb @Redalert2fan RoanokeVirginia (talk) 23:11, 27 August 2021 (UTC)[reply]
Other ideas:
  1. The idea of revisiting the bot's basic tuning on its dwell times, throttling and other patches now that stability has been improved deserves consideration. If the bot could run just a touch faster at every step, the overall load would be reduced. Abductive (reasoning) 00:46, 28 August 2021 (UTC)[reply]
  2. Set aside one of the four channels for single use. Abductive (reasoning) 00:46, 28 August 2021 (UTC)[reply]
    @Abductive: setting aside one channel for single use wouldn't help much, because in addition to huge batch jobs, Abductive has simultaneously been flooding the bot with masses of individual page requests. See e.g. this set of 1,500 bot edits. By using my browser's Ctrl-F to search in the page, I find that Abductive | #UCB_webform matches 370 edits (a batch job of 2200 pages), and a search for Abductive | #UCB_toolbar matches a further 114 pages.
    So basically, Abductive has been WP:GAMING the limits by occupying one of the bot's 4 channels with a batch job, and then monopolising another channel with some sort of systematic flood of single jobs. (This also happened on at least one previous day this week.)
    Note those 114 single use edits (the toolbar edits) are only the pages which were actually edited. It is likely that there were other toolbar requests which did not lead to the bot making an edit.
    Note also that the 114 toolbar edits spanned a period from 00:58, 28 August 2021 to 08:36, 28 August 2021. That is after @Headbomb's warning[2] at 19:34 27 August about WP:DE and WP:IDHT behaviour ... and it is 18 days after @AManWithNoPlan cut the limit on category batches twice in response to Abductive's abuse of the higher limit.
    It is also 28 days since I rejected Abductive's offer to be a meatpuppet for me (see User talk:Abductive/Archive_21#Talk_page_usage) by running some of my bare-URL batches. That would have amounted to my jobs effectively taking two of the bot's four channels, which would be WP:GAMING the bot's scheduler and impeding other editors' access to the bot.
    This looks to me like intentional WP:DE, which will not be resolved by anything along the lines @RoanokeVirginia's thoughtful proposals. The only solution I can see here is some sort of restraint on Abductive's use of Citation bot. --BrownHairedGirl (talk) • (contribs) 18:19, 28 August 2021 (UTC)[reply]
  3. Request a Citation bot 2. I think this is something any one of us can do, correct? Abductive (reasoning) 00:46, 28 August 2021 (UTC)[reply]
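
The manual Ctrl-F tallying of #UCB_webform and #UCB_toolbar edit-summary tags described above can also be scripted. A sketch, assuming the standard MediaWiki API and the summary format visible in the bot's contribs; the helper name and the 1500-edit window are illustrative, and the example counts in the final comment are just the figures quoted above.

  import requests
  from collections import Counter

  API = "https://en.wikipedia.org/w/api.php"

  def recent_summaries(n=1500):
      """Fetch the edit summaries of Citation bot's last n edits."""
      summaries, cont = [], {}
      while len(summaries) < n:
          params = {"action": "query", "list": "usercontribs",
                    "ucuser": "Citation bot", "ucprop": "comment",
                    "uclimit": "500", "format": "json", **cont}
          data = requests.get(API, params=params).json()
          summaries += [c.get("comment", "")
                        for c in data["query"]["usercontribs"]]
          cont = data.get("continue", {})
          if not cont:
              break
      return summaries[:n]

  channels = Counter()
  for s in recent_summaries():
      if "Abductive" in s:  # edits suggested by one user
          for tag in ("#UCB_webform", "#UCB_toolbar"):
              if tag in s:
                  channels[tag] += 1
  print(channels)  # e.g. Counter({'#UCB_webform': 370, '#UCB_toolbar': 114})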
Thanks for the thoughtful suggestions, @RoanokeVirginia.
  1. A capacity increase would be very welcome, but it is very unlikely that any remotely feasible capacity increase could remove the need to use that capacity efficiently and effectively. So Abductive's conduct would still be a problem.
  2. a statement on effective use of the bot sounds like a good idea, but I don't expect that a nudge would have any impact on an editor who has repeatedly rejected nudges.
Having used AWB heavily for 15 years, and having run several bot tasks, I am used to the fact that big jobs usually require a lot of preparation if they are to be done efficiently and accurately. So doing prep work and test runs before feeding a big job to Citation bot is second nature to me, and probably also to Headbomb.
By contrast, Abductive appears A) to want to continually set the bot off on big runs with no clear purpose or selection criteria, just because they can, and B) to object to any prep work. The problem is the combination of A and B. Many other editors use Citation bot without extensive prep work, but they usually do so for short bursts or for bigger jobs targeted on a particular topic area. The problem with Abductive's jobs is that they lock up one of the bot's job slots near-permanently, often with very low return, for no apparent reason. --BrownHairedGirl (talk) • (contribs) 15:35, 28 August 2021 (UTC)[reply]
I don't object to prep work, I'm just pointing out that it is inefficient. If, as you say above, you have tens to hundreds of thousands of bare url articles to run by the bot, shouldn't you be accommodated somehow? I have been running 2200 of your bare urls at a time when it looks like the load on the bot is low, and holding off on runs when it looks like the load is high. Abductive (reasoning) 18:40, 28 August 2021 (UTC)[reply]
@Abductive: I have not supplied you with a list of bare URLs, so you could not have been running 2200 of your [i.e. BHG's] bare urls. You rejected my suggestion (at User talk:Abductive#A_citation_bot_job) that you tackle the pages which transclude {{Bare URL inline}}, and its transclusion count has dropped by only 84 in the last 25 days, so I doubt you were tackling that.
But my main point here is that you have been WP:GAMING the system by flooding the bot with hundreds of toolbar requests while also running a big batch job, so your "holding off" claim seems to me to be bogus. --BrownHairedGirl (talk) • (contribs) 19:07, 28 August 2021 (UTC)[reply]
As a matter of fact I have been running from {{Bare URL inline}}, and nobody is gaming the system. Using the bot for its intended purposes is legitimate. Abductive (reasoning) 19:14, 28 August 2021 (UTC)[reply]
@Abductive: Gaming the bot's limits is not legitimate.
And I see no evidence of your claim that you have been targeting {{Bare URL inline}}. Can you provide any evidence of that claim? --BrownHairedGirl (talk) • (contribs) 19:21, 28 August 2021 (UTC)[reply]
I drew a list from Category:CS1 errors: bare URL and have been running them 2200 at a time. Just now I requested a run of 2200 drawn directly from {{Bare URL inline}}. See if that brings down your metric. Don't forget that the bot isn't very good at fixing bare urls. Abductive (reasoning) 20:10, 28 August 2021 (UTC)[reply]
@Abductive: the bot's latest list of contribs shows a batch of 220 pages submitted by you.
Your use of a webform rather than links from a page impedes scrutiny, because only the pages which have actually been edited are visible to other editors. However, checking the recent bot contribs for the pages which have been edited, the list appears to consist partly of pages which transcluded {{Bare URL inline}}, e.g. [3],[4],[5] ... but also some pages which appear to have never transcluded {{Bare URL inline}}, e.g. COVID-19 vaccination in Malaysia (see bot edit [6], and the unsuccessful WikiBlame search for "Bare URL inline"[7]).
So your current batch is a mix of {{Bare URL inline}} and something else ... and the fact that you have started that batch now does not alter the fact that the evidence so far indicates that your claim to have been processing batches of 2200 {{Bare URL inline}} pages is false. My bare URL cleanup jobs average about 30% of pages being completely cleared, and the first 100 of your latest batch show 46/100 having bare URLs replaced. So if you had previously run batches of 2200 {{Bare URL inline}} pages, we should have seen multiple instances of 700 pages being cleared. That has not happened.
And you have still not addressed the issue of how you have been WP:GAMING the bot's limits by stashing up hundreds of toolbar pages whilst running a big batch. --BrownHairedGirl (talk) • (contribs) 20:53, 28 August 2021 (UTC)[reply]
I have been running the bot on thousands of members of the bare url category, and just now on the list of transclusions of {{Bare URL inline}}. Why there should be a difference in its ability to clear one more than the other I cannot say. Abductive (reasoning) 21:04, 28 August 2021 (UTC)[reply]
@Abductive: Thank you for finally confirming that your claims to have previously processed batches of 2200 pages transcluding {{Bare URL inline}} were false.
Making repeated false assertions is not collaborative conduct. --BrownHairedGirl (talk) • (contribs) 21:12, 28 August 2021 (UTC)[reply]
I have been running the bot on thousands of bare urls, for you. I did not deliberately make a false statement. Abductive (reasoning) 21:16, 28 August 2021 (UTC)[reply]
@Abductive: So the evidence I can see indicates that your statements are false.
I have not provided you with any lists of articles with bare URLs. You have been processing pages in Category:CS1 errors: bare URL, which is a related-but-different issue which you chose. And in the last few hours, you started processing one batch of 2200 pages, some of which transclude {{Bare URL inline}} ... but until this evening, I see no other such batch.
I have asked you for evidence to support your claims, and you have provided none. If you wish to sustain your assertion that you have previously submitted batch jobs of pages which transclude {{Bare URL inline}}, please provide some diffs of the bot's edits in some batches, or better still some contribs lists which show multiple bot edits on those batches. --BrownHairedGirl (talk) • (contribs) 21:42, 28 August 2021 (UTC)[reply]
Nobody is misusing the bot. Requesting a category run is a function of the bot. Requesting individual runs of articles is a major purpose of the bot, and it should be noted that if all four channels are in use, nobody can request the bot to run on an article they just created. Just about any random category run these days will only achieve 30%; so what does it matter if it is a big or small category? You are running the bot on a collection of articles in which you are achieving only 30% of your goal of fixing bare urls. But you are not misusing the bot either, or gaming the system by running as close to 2200 as you can each time. Abductive (reasoning) 21:04, 28 August 2021 (UTC)[reply]
@Abductive: it is hard to assess whether you misunderstand the issue being discussed, or whether you are being intentionally evasive.
Indeed, Requesting individual runs of articles is a major purpose of the bot ... but your rapid requests of hundreds of individual articles within a few hours while you are already running a max-size batch is in effect running a second batch, and thereby WP:GAMING the bot's limits. If that is not clear to you, then we have a big problem.
Your assertion that Just about any random category run these days will only achieve 30% is demonstrably false in two respects:
  1. this thread has plenty of evidence of you feeding random categories to the bot, and getting edit rates of only ~10%, including one category with only a 7% edit rate.
  2. you are conflating two different percentages. My bot runs of pages with bare URLs usually clear, on a first pass, all the bare URLs on about 30% of the pages. However, on other pages in the set, the bot removes some of the bare URLs and/or makes other changes. So the total edit rate is usually well over 50%.
This is beyond ridiculous. --BrownHairedGirl (talk) • (contribs) 21:32, 28 August 2021 (UTC)[reply]
The fact that your bare urls get above 50% is one of the reasons I am assisting you in running them 2200 at a time. Individual requests fix errors, and do more to hold open a channel (from runs) than to clog a channel. Occasional low-productivity runs on categories are a matter of bad luck. My runs are in no way impeding your efforts, and in fact are helping them, so you should not be at all concerned about my use of the bot. Abductive (reasoning) 21:42, 28 August 2021 (UTC)[reply]
@Abductive: This is ridiculous.
  1. No, you are not running my bare URL jobs 2200 at a time. As I explained above, you are not running them at all: Category:CS1 errors: bare URL is a separate issue.
  2. But hundreds of rapid individual jobs do clog a channel, as surely as if they were a batch. They are in effect a second channel.
  3. My runs are in no way impeding your efforts. Again, nonsense: your routine use of two channels slows down the bot, and impedes other editors from even starting new jobs.
  4. so you should not be at all concerned about my use of the bot. This is a severe case of WP:IDHT.
I fear that this may have to be escalated. --BrownHairedGirl (talk) • (contribs) 22:00, 28 August 2021 (UTC)[reply]
Don't you think your efforts should be directed towards improving this and other bots, and not at a misguided effort to get me to stop using the bot "wrong"? Abductive (reasoning) 22:05, 28 August 2021 (UTC)[reply]
I would dearly love to be able to focus on my work, rather than having to divert my efforts into trying to dissuade one WP:IDHT editor from persistently disrupting my work and others' work by systematically abusing this bot and repeatedly making misleading and/or false assertions when I attempt to discuss the problem. --BrownHairedGirl (talk) • (contribs) 22:55, 28 August 2021 (UTC)[reply]
I never deliberately made false assertions, and I am not abusing the bot or disrupting anything. I am using the bot as intended, and doing a pretty good job of it. Abductive (reasoning) 23:04, 28 August 2021 (UTC)[reply]
If your false assertions were unintentional, then I fear that you may not comprehend the issues adequately.
The bot has twice had to be reconfigured to prevent your abuse, so you are demonstrably not using it as intended.
And if you really think that repeatedly wasting the bot's time by feeding it huge categories which need almost no bot work is what the bot was intended for, then you have a very odd idea of what bots are intended for. --BrownHairedGirl (talk) • (contribs) 23:12, 28 August 2021 (UTC)[reply]
Reconfigured to make the bot use its 4 channels more effectively. And a review of the totality of my requests will show that I am getting above average results. Abductive (reasoning) 23:40, 28 August 2021 (UTC)[reply]
No, you have not reconfigured the bot to make it use its 4 channels more effectively. All you have done is to selfishly and disruptively grab 2 of the 4 channels for yourself. This makes the bot work no faster, and use its time no more effectively; your WP:GAMING of the limits just gives you a bigger slice of the bot's time and denies other editors access to the bot.
You offer no evidence at all of the claimed effectiveness of your requests; that is just an empty boast. It is not possible to assess that without access to the bot's logs, because the bot's contribs lists provide no way of telling whether any of your single requests led to no edit. For all we can see from the contribs list, it may well be that the edit rate for your single requests is as low as the 7% you got from a category. --BrownHairedGirl (talk) • (contribs) 04:20, 29 August 2021 (UTC)[reply]
At no point am I selfishly or disruptively grabbing channels. First off, that's not even possible, and second, the bot is improving the encyclopedia when it makes corrections. Additionally, they often happen to be bare url articles, a project of yours. How is that selfish? Nobody is calling you selfish for running thousands of articles past the bot. Abductive (reasoning) 06:00, 29 August 2021 (UTC)[reply]
You are cherry-picking evidence. Why not mention my recent category run that got lucky and found and corrected errors at a 70% rate? And to suggest that a handful of individual requests makes it impossible to assess my overall rate is silly; it is easy to assess my overall rate using just the category and multiple-page runs. But why are you even interested in my activities? As I said before, they are not impeding your runs, and in fact I am running your bare url finds. Right now, for instance, I am not running anything, because the bot seems to be heavily loaded. Abductive (reasoning) 06:00, 29 August 2021 (UTC)[reply]

Please find another place to argue. Thanks. --Izno (talk) 16:32, 29 August 2021 (UTC)[reply]

@Izno: surely this is the appropriate place to address systematic misuse of the bot? --BrownHairedGirl (talk) • (contribs) 17:31, 29 August 2021 (UTC)[reply]
You said it yourself, you may need to escalate. I am suggesting you do so, because it seems clear to me that you will not be able to persuade Abductive here. Izno (talk) 18:08, 29 August 2021 (UTC)[reply]
Fair enough, @Izno. When it recurs, I will escalate. --BrownHairedGirl (talk) • (contribs) 20:11, 29 August 2021 (UTC)[reply]

One editor, two simultaneous batch jobs

Yet again, @Abductive is hogging the bot by running two batch jobs simultaneously.

The latest bot contribs shows that the bot is simultaneously processing both:

  1. Category:Gambling terminology (63 pages), with an edit rate so far of 32/61 pages, i.e. 53%
  2. Category:Papermaking (88 pages), with an edit rate of 17/88 pages, i.e. 19%

More low-return speculative trawling, using half the bot's capacity, delaying jobs targeted at bot-fixable issues, and locking out single-request jobs. This is more WP:DE. --BrownHairedGirl (talk) • (contribs) 15:19, 29 August 2021 (UTC)[reply]

I requested the first one, and nothing happened for a very long time. I entered the second one, and nothing happened for a very long time. I have entered more, and still nothing has happened. All of a sudden, both started running. Abductive (reasoning) 15:27, 29 August 2021 (UTC)[reply]
I am going to have to ask you to stop complaining about my so-called speculative trawling. Using the bot to clean up a category's citations is a legitimate use of the bot, and it is not a problem. Please strike your accusation of disruptive editing. Abductive (reasoning) 15:27, 29 August 2021 (UTC)[reply]
On the contrary, your long pattern of using the bot disruptively has been well-documented. This is just the latest round of the saga. I will stop complaining about your disruption when you stop the disruption.
Why were you entering a second category before the first one had even started processing? Are you trying to stash up batch jobs in some sort of private queue? --BrownHairedGirl (talk) • (contribs) 15:59, 29 August 2021 (UTC)[reply]
You are accusing me of acting in bad faith and of being disruptive, when in fact it is some sort of error with the bot. Please desist. Abductive (reasoning) 16:09, 29 August 2021 (UTC)[reply]
Please stop being disruptive, and please stop WP:GAMING the bot's queuing system.
The error in the bot's functionality was simply that it failed to block your gaming of the queue, a practice which you have already acknowledged above. If you were not gaming the queue, that bot functionality would be irrelevant. --BrownHairedGirl (talk) • (contribs) 16:13, 29 August 2021 (UTC)[reply]

As I said above, please find another place to argue. Thanks. --Izno (talk) 16:38, 29 August 2021 (UTC)[reply]

Izno, as I replied above, surely this is the place to raise misuse of the bot? --BrownHairedGirl (talk) • (contribs) 17:41, 29 August 2021 (UTC)[reply]
BrownHairedGirl, what most of us (I feel) are getting at is that you have problems with one specific user's actions, and this discussion is only between the two of you. While it is about how the bot is used, it seems to have long passed anything to do with the edits of the bot, what operators should do, etc. I think originally starting a discussion here was a good idea, as it should have been done, and it led most people to reconsider their use of the bot; however, this has turned into a back-and-forth situation that nobody here can help. It's not about the bot anymore; it reads as a conflict between two people.
Now, apparently this specific section reports a technical problem, "One editor, two simultaneous batch jobs". It is great to report that, as it should not be possible; this is exactly the place to report technical problems.
But instead of just sticking to the details of the problem, it instantly devolved into back-and-forth accusations.
Summary: this was a good place to start a discussion about bot usage due to limited resources, which should also be fixed, but it is now just a conflict between two people. Suggestion: if you find technical issues, just report them and wait for the operator to respond; if that is combined with what you see as "misuse", it seems we have passed this venue and it should be discussed elsewhere. Redalert2fan (talk) 13:34, 30 August 2021 (UTC)[reply]
@Redalert2fan: the only impact of Abductive's systematic misuse of the bot is to impede other editors' use of the bot, so that belongs on the page where editors are discussing how to use the bot.
See the section below #Citation_bot, where Fade258 posted[8] about the bot not having processed a page after several hours of waiting. That happened because an earlier round of Abductive's gaming of the bot's queuing system had blocked new jobs from starting. Fade258's post didn't do a great job of explaining the problem they had encountered, but this was the right place to raise the no-response-from-bot problem. Editors such as Fade258 would be much less likely to find the bot overloaded if we didn't have one editor systematically occupying 2 of the 4 channels. --BrownHairedGirl (talk) • (contribs) 18:06, 30 August 2021 (UTC)[reply]
I am very sorry that the bot somehow allowed me to make more than one run at a time. But I did not "game" anything, and your continued accusations amount to incivility. Abductive (reasoning) 21:58, 30 August 2021 (UTC)[reply]
My accusations are both a) evidenced, and b) supported by your own assertions. The technical issue is that the bot failed to block your request for a second batch before the first one had even started; but that technical issue arose only because of your lack of restraint. --BrownHairedGirl (talk) • (contribs) 23:02, 30 August 2021 (UTC)[reply]
You asked if this is the place, and I explained why I think it is not. I will leave it at that. I have no problem with you having issues with Abductive's edits, but if you have to push back at me as well over a question that you asked, and get in another hit at Abductive like that, I'm no longer interested in explaining. You picked out one specific section of my comment just to be uncivil towards someone else. I urge you to reconsider as well; it increasingly looks to me like the bot should only do what you want.
And this is just the kind of comment from me that I think should not be posted on this page, but which I had to make. Redalert2fan (talk) 22:40, 30 August 2021 (UTC)[reply]
@Redalert2fan: my reply to you is not a "fight", and it was in no way uncivil.
And please AGF. I do not in any way believe that the bot should only do what I want; my objections are to misuses which I have documented thoroughly. --BrownHairedGirl (talk) • (contribs) 22:57, 30 August 2021 (UTC)[reply]
And as I said previously, I do not agree with the term "misuse" being used, and now it is thrown around like a buzzword all the time; but I specifically left out my own view this time above, because I already know zero progress is going to be made.
If I'm honest, whenever I want to use the bot it's not Abductive who is blocking people out anymore; you are taking up one of the slots all the time as well, claiming that your edits are the most important. I think that everyone should be allowed to use this tool if they think they can improve something with it, and not keep being targeted over their edits. "AGF", huh; how about applying this to everyone that uses the bot? You didn't do that above with Abductive either. Redalert2fan (talk) 23:12, 30 August 2021 (UTC)[reply]
@Redalert2fan: I do not use the word misuse lightly. I do use it in this case, because I have clear evidence, documented above, that Abductive has been repeatedly misusing the bot in two ways: a) repeatedly flooding it with lists of articles which lead to very low rates of change; b) repeatedly gaming the bot's queuing system to get the bot to run two batch jobs for Abductive simultaneously. Before you dismiss my complaints, please read the evidence, which is spread over several weeks in the threads above.
Your assertion that I am claiming that your [BHG's] edits are the most important is demonstrably false. At no point have I made or implied any such claim. On the contrary, I have repeatedly stressed the importance of getting the bot to do actual edits rather than processing pages which need no changes, and I have explicitly stated that I regard the academic journal refs targeted by Headbomb as a more important task.[9] So please strike that slur.
Yes, I am usually taking up one of the bot's slots. But
  1. I take up one of the bot's slots, whereas Abductive has been systematically flooding the bot with two simultaneous tasks (see evidence here[10])
  2. Unlike Abductive, I do not set the bot off on large unproductive speculative trawls. Note that jobs which I ask the bot to do are highly visible, because the bot has work to do on most pages; Abductive's low-return speculative trawls leave less trace, because the bot's contribs list does not log pages where the bot makes no edits. Note also that Abductive actually defends their long history of wasting the bot's time on huge unproductive trawls: see e.g. [11] Bots exist to do tedious editing tasks. Your notion that editors have to do the tedious work before giving the bot a task is contrary to the purpose of bots. Those unproductive, speculative trawls by Abductive repeatedly tie up 1/4 of the bot's capacity for days on end, because Abductive will not restrict their use of the bot to batches which actually achieve something. Yet Redalert2fan somehow thinks it is very uncivil of me to point this out.
I remind you that WP:AGF explicitly says this guideline does not require that editors continue to assume good faith in the presence of obvious evidence to the contrary. In the case of Abductive, I have documented that obvious evidence to the contrary.
If you have concerns that too much of the bot's time is taken up by targeted batch jobs of large sets of pages with identified bot-fixable problems, such as those suggested by me, Headbomb, or Josve05a, then please open a discussion on that issue. I think it would be helpful to have such a meta-discussion about how to balance the bot between batch tasks and individual jobs. But please stop allowing your concerns about that to lead you into false accusations about my motives and goals ... and please stop conflating bot overload with efforts by Abductive to use the bot unproductively and to actively game its limits. --BrownHairedGirl (talk) • (contribs) 00:21, 31 August 2021 (UTC)[reply]
I do not feel this is productive in any way nor do I think this is the right place for this as I said before. I'm not interested in clogging this page up further either. I will leave it here. Sorry if this is not satisfactory, but I feel it is for the best. Redalert2fan (talk) 10:42, 31 August 2021 (UTC)[reply]

I've been following this with a little interest, as this is on my watchlist, though I've not followed everything. BHG grabs my attention due to the bare-URL identification bot runs. But it is reasonable to discuss whether the bot's startup can be modified with an algorithm to avoid unfair usage, or perhaps further documentation. I'm not in the greatest of standing at the moment, but Wbm1058 has a few hours left on my watchlist and I couldn't help but notice the comment (I think it's been there for a long time): "Secret to winning the Race Against the Machine: become an expert bot programmer, and hope that the bots don't learn to program themselves. HAL?". That comment got me to dreaming about becoming an "expert bot programmer" (I got lost from when input ceased to be from 80-column punch cards, so a bit too late in life methinks). Wbm1058 seems to know something about bots and might be able to comment/mediate here? Thankyou. Djm-leighpark (talk) 23:28, 30 August 2021 (UTC)[reply]

A couple of thoughts. Is Abductive a primary maintainer of the bot? In that case they could reasonably request an exemption for two jobs, if only for low-maintenance ones. The other might be a lightweight new control/monitor bot to watch the runners and queuers: if user A was running two or more jobs and another user B had a scheduled run holding for some time, the control bot would chop one of user A's jobs and reschedule it to start from the place it left off. Actually this would likely be beyond anyone's capability to do in a reasonable time period. Sorry I come up with stupid questions. Thankyou. Djm-leighpark (talk) 01:34, 31 August 2021 (UTC)[reply]
Someone who knows how to, and has permission to do so, needs to increase the total number of PHP processes from 2*2 to something bigger. AManWithNoPlan (talk) 01:47, 31 August 2021 (UTC)[reply]
Not to state the obvious, but shouldn't there be a place where we can find someone like that? Redalert2fan (talk) 12:00, 31 August 2021 (UTC)[reply]
multiple efforts have been made. I personally do not plan to try again. Not worth wasting my time. AManWithNoPlan (talk) 12:57, 31 August 2021 (UTC)[reply]
@AManWithNoPlan: If you have moment, please could you tell me whether my understanding is correct?
AIUI:
  1. the bot processes one page at a time, and its throughput is constrained by the time it takes the bot's core to parse each page, invoke zotero etc, then process the results and save any changes.
  2. there is only one instance of that core functionality
  3. the order in which pages are processed involves taking pages one at a time from each of the 4 PHP processes in turn (however such turns are allocated)
So AIUI adding more PHP processes would allow the bot's time to be shared between more users, but would not increase overall throughput. Is that correct? --BrownHairedGirl (talk) • (contribs) 22:10, 31 August 2021 (UTC)[reply]
All PHP jobs are separate, with no shared processing. We are way below the memory and CPU limits of the tool server with 4 processes. The URL expander is run by Wikipedia, so that might be a bottleneck, but only for that step. AManWithNoPlan (talk) 01:15, 1 September 2021 (UTC)[reply]

@Legoktm: per this discussion is this process a violation of Toolforge Rule #6: "Do not provide direct access to Cloud Services resources to unauthenticated users"? Or has this been Toolforge admin vetted? wbm1058 (talk) 15:53, 1 September 2021 (UTC)[reply]

Everyone is authenticated. Headbomb {t · c · p · b} 15:55, 1 September 2021 (UTC)[reply]
Perhaps, but I'm hearing complaints that the bot "allows for anyone to make arbitrary queries". – wbm1058 (talk) 16:00, 1 September 2021 (UTC)[reply]
Where at? Headbomb {t · c · p · b} 16:09, 1 September 2021 (UTC)[reply]
BrownHairedGirl, you, and maybe others have complained that Abductive makes arbitrary queries. Essentially you seem to be complaining that Abductive launches "DoS". wbm1058 (talk) 16:14, 1 September 2021 (UTC)[reply]
"DDoS" on Citation Bot, maybe, not DDoS against Toolforge. Headbomb {t · c · p · b} 16:23, 1 September 2021 (UTC)[reply]
His requests did fix stuff; they are not vandalism or anything like that. The concern from some was/is that, due to the bot's limited capacity, they were not optimal: out of a large request only a very small number of pages were edited; for the rest there was nothing to do, or only minor fixes were made. Currently it is better to prioritise what the bot is being used for, that is, pages where there is some certainty ahead of time that large or more important fixes will be made, and to prevent the bot from spending time checking pages where there is nothing to fix. BrownHairedGirl seems to have developed a list of pages that fit these criteria.
Now if the capacity were larger there wouldn't be much of a problem with those requests from Abductive, since they did contain some pages that were fixed.
It is not a case of someone just filling up the queue with the same pages 24/7, nor of knowingly feeding pages that have nothing to do specifically to block others from using the bot. Even though some, like me, feel that anyone should be able to request any page or list they want to have checked, it currently is just a case of how we all use this limited resource together in the most efficient way. Redalert2fan (talk) 16:29, 1 September 2021 (UTC)[reply]
@Redalert2fan: my complaint about Abductive is that they:
  1. routinely fill one of the bot's 4 channels with large sets of pages where the bot has very little to do. This waste of the bot's resources could be avoided by testing samples of these sets before processing the whole lot, but even though I showed Abductive how to do that, they strenuously resist the principle of preparation work.
  2. repeatedly fill a second of the bot's 4 channels by piling up hundreds of individual page requests while the bot is already processing a batch of theirs.
    This works because the bot won't allow a user to run two batches simultaneously, but it does allow a user to submit an individual job while the batch is being processed. That's good bot design, because it allows an editor to submit a batch and not be locked out from checking an individual page that they have been working on ... but Abductive has deliberately gamed this system by submitting hundreds of individual pages while their batch is running (see the sketch below this comment).
@Wbm1058: I am not sure whether Abductive's use of the bot counts as DDoS. I don't think that Abductive's goal is to deprive others of the use of the bot, just that Abductive shows i) a reckless disregard for the fact that the effect of their actions is to deny others the use of the bot; ii) no particular interest in any type of article or any topic area, just a desire to feed lots of pages to the bot.
I agree with Redalert2fan that this is about how we all use this limited resource together in the most efficient way. I think we need some guidelines on how to do that, and I will try to draft something. --BrownHairedGirl (talk) • (contribs) 04:46, 6 September 2021 (UTC)[reply]
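To make the gaming mechanism described in point 2 concrete, here is a minimal sketch of that kind of admission rule: one concurrent batch per user, but single-page requests always admitted. The function and variable names are hypothetical illustrations, not the bot's actual code (which is PHP; this sketch is Python):

    # Hypothetical sketch of the admission rule described in point 2 above.
    active_batches: set[str] = set()   # usernames with a batch in progress

    def admit(user: str, request_type: str) -> bool:
        if request_type == "batch":
            if user in active_batches:
                return False           # a second batch from the same user is rejected
            active_batches.add(user)
            return True
        return True                    # single-page requests are always admitted

Under a rule like this, a user who stacks up individual page requests alongside their running batch effectively occupies two channels, which is the behaviour complained about above.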
In an ideal world, the bot would only allow a maximum of ONE batch job at a time, and it would have a queue for submission of such batch jobs. It is immensely frustrating to 'ordinary' editors who, having spent a lot of time on an extensive clean-up and wanting to get the citations checked, get no response or (eventually) a failure. Batch jobs are by definition not needed in real time and, until there is such a queuing mechanism, may I suggest that the guideline states an absolute ban on submitting more than one run before the preceding one has finished (subject to a 48 hour [72 hour?] timeout). If there is any way to encapsulate the rest of my frustration in the guideline, I would be most grateful. --John Maynard Friedman (talk) 10:39, 6 September 2021 (UTC)[reply]
@John Maynard Friedman: I share your frustration, which is why I have repeatedly raised this issue.
However, I think your suggested remedy is far too extreme.
I watch the bot contribs very closely, because I follow behind it cleaning up as many as I can of the pages where it seems not to have fixed any bare URLs.
It is quite rare for all 4 channels to be simultaneously in use by batch jobs, and so long as one channel is free then individual requests such as yours will be processed within a few minutes ... unless Abductive has gamed the system by stashing up dozens or hundreds of individual page requests. There is no need to limit the bot to one batch job at a time, and any attempt to do so would seriously and unnecessarily impede the bot's capacity to process batches.
I think that if there is some throttling of batch jobs, a distinction should be made between batches which have been selected because they concentrate bot-fixable problems (such as my sets of articles with bare URLs, or Headbomb's sets of articles citing scholarly journals) and what I call "speculative trawls", i.e. sets of articles selected on the chance that the bot may find something.
The speculative trawls by Abductive were a serious drain on the bot when the category limit was 4,400 pages and Abductive was stashing up requests for categories which fell just under that limit, often with very low returns. In response to that abuse, the size limit for categories was cut twice, initially to 1,100 and then to 550.
Abductive continues to run a lot of speculative trawls based purely on the size of the category. For example, this morning (see 2000 bot edits) they have fed the bot first Category:Articles with specifically marked weasel-worded phrases from August 2021 (497 pages) and then Category:Articles with specifically marked weasel-worded phrases from July 2021 (460 pages). Neither category concentrates pages by topic, so there is no topic-of-interest involved, and both categories are based on an attribute which the bot cannot fix. The only reason I can see for selecting them is that they fall just under the bot's category size limit.
The bot has not had many other jobs this morning, so Abductive's speculative trawls haven't significantly delayed anything else. But at most other times, this use of the bot does create bottlenecks.
We don't need to hobble the bot's throughput of targeted batch jobs just to restrain one editor who shows reckless disregard for the effect on others. --BrownHairedGirl (talk) • (contribs) 13:19, 6 September 2021 (UTC)[reply]
Instead of uselessly complaining about perfectly legitimate uses of the bot, concerned senior editors should go to the Village Pump and request that this be fixed in a way that the bot operator has suggested, but which is outside his control. Abductive (reasoning) 23:17, 7 September 2021 (UTC)[reply]
Sigh.
Gaming the queuing system to deprive other editors of prompt processing of their requests is not a legitimate use of the bot.
Repeatedly wasting the bot's time on low-return speculative trawls is not a legitimate use of the bot.
Regardless of what improvements might be made to the bot, Wikipedia is a collaborative project, and editors should not systematically disrupt the work of others. --BrownHairedGirl (talk) • (contribs) 06:19, 8 September 2021 (UTC)[reply]
Repeatedly complaining here will accomplish nothing. The jobs I request from the bot have been fixing lots of high-readership articles, which is one of the metrics I use to select more runs. I have been refraining from starting runs when there are already three big runs going, but at some point new editors are going to discover the bot and then there will be times when there are four large runs going. It would be advisable for you to use your energy to cajole the technical folks at the Village Pump into doing something. Abductive (reasoning) 18:38, 8 September 2021 (UTC)[reply]
For goodness sake. Category:Articles with specifically marked weasel-worded phrases from August 2021 are collections of articles by a cleanup tag, not by readership. --BrownHairedGirl (talk) • (contribs) 20:17, 8 September 2021 (UTC)[reply]
True, but if you look at the articles in those categories, they tend to be ones that attract readers, attract users who add questionable material to them, and then attract editors who add the weasel word tags. And the bot is correcting a reasonably high percentage of them. Win-win. Abductive (reasoning) 20:23, 8 September 2021 (UTC)[reply]
According to the pageviews tool, the 492 pages in that category have a combined average pageviews of 413,000 per day, or 839 per page per day, and there are 217 articles in that category with less than 100 views per day, and 71 with less than 10 views per day. That's hardly high-readership, so the conclusion that articles with weasel word tags attract readers isn't true. * Pppery * it has begun... 00:52, 9 September 2021 (UTC)[reply]
Wikipedia gets about 255 million pageviews a day. There are 6,373,026 articles on Wikipedia (492/6,373,026)*255,000,000 = 19,686, which is 21 times less than 413,000. So you are wrong by more than an order of magnitude. A simple glance at the articles in the category would reveal that there are a lot of high-interest articles in there. For example, the bot just fixed Deborah Birx. Heard of her, right? Abductive (reasoning) 08:30, 9 September 2021 (UTC)[reply]
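The arithmetic in the two comments above can be reproduced directly; all figures are taken from the thread and should be treated as approximate:

    # Reproducing the figures quoted above (all inputs are from the thread).
    category_pages = 492
    category_views = 413_000          # combined average pageviews per day
    all_articles   = 6_373_026
    all_views      = 255_000_000      # total pageviews per day

    per_page = category_views / category_pages                      # ~839
    expected_if_random = category_pages / all_articles * all_views  # ~19,686
    ratio = category_views / expected_if_random                     # ~21

    print(per_page, expected_if_random, ratio)

Both computations are internally consistent; they simply measure different things (the mean views per page versus the number of low-traffic pages in the tail).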
Cherrypicking one example out of dozens of sets of hundreds of articles is no help. And averages obscure the fact that, as Pppery has shown, a significant proportion of these articles have low page views.
If you want to set the bot to work on highly-viewed articles, then use a methodology which selects only highly-viewed articles, then work through it systematically, avoiding duplicates in your list and articles which the bot has already processed in the last few months. --BrownHairedGirl (talk) • (contribs) 01:04, 11 September 2021 (UTC)[reply]
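A minimal sketch of the selection method suggested here: filter candidate pages by a pageview threshold and drop anything the bot has processed recently. The inputs (a pageview lookup and a log of recent bot edits) are hypothetical placeholders, not an existing tool:

    # Hypothetical sketch of the batch selection described above.
    def select_batch(candidates, pageviews, recently_processed, min_views=1000):
        batch, seen = [], set()
        for page in candidates:
            if page in seen or page in recently_processed:
                continue               # skip duplicates and recent bot work
            seen.add(page)
            if pageviews.get(page, 0) >= min_views:
                batch.append(page)     # keep only highly-viewed articles
        return batch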
My activities are helping build the encyclopedia. Removing duplicates is not the best idea, since the bot often takes more than one pass to make all possible corrections. My offer still stands; if you like, place a list of articles in User:Abductive/CS1 errors: bare URL, let me know, and I will run the bot on them. Otherwise you should not concern yourself with other users' legitimate uses of the bot. Abductive (reasoning) 03:06, 12 September 2021 (UTC)[reply]
Arguably, your activities are hurting the encyclopedia more than helping, by acting as a denial of service attack on Citation bot that prevents others from using it more effectively. —David Eppstein (talk) 04:11, 12 September 2021 (UTC)[reply]
That presupposes that other large runs are somehow more helpful, which they are not. Also, please note that I make an effort to avoid running the bot when there are three other large jobs running, while other users do not. Abductive (reasoning) 04:41, 12 September 2021 (UTC)[reply]
Well, that summarizes the dispute, doesn't it? In your own mind, your searches are better than anyone else's, despite all evidence to the contrary, and so you think you deserve more access to the bot than anyone else, and will game the system to take that access because you think you deserve it. —David Eppstein (talk) 05:49, 12 September 2021 (UTC)[reply]
That is incorrect. I have an equal right, and I use the bot judiciously. I do not game the system, and I consider such accusations to be uncivil. Abductive (reasoning) 05:57, 12 September 2021 (UTC)[reply]
"I have an equal right". Not when your use of the bot is disruptive, low-efficiency, and prevents others from using it more effectively. A run on a random non-citation-related maintenance category is not as "helpful" as targeted cleanup runs, especially when it makes running the bot on individual articles crash/extremely slow because of a lack of ressources. Headbomb {t · c · p · b} 13:34, 12 September 2021 (UTC)[reply]
@Abductive: I have posted[12] below at #Bot_still_being_abused_by_Abductive an analysis of the bot's latest 1500 edits.
It shows that:
  1. Your batch requests continue to be poorly chosen;
  2. You continue to game the queueing system by flooding the bot with single-page requests while the batches are being processed, amounting to 73% of single-page requests in that period. I estimate the rate of your single-page requests to be one every 3 minutes over 9 hours.
  3. Your choice of single-page requests has no evident basis, and includes a significant proportion of pages (6 of the sampled 30) which wasted the bot's time by following closely after a previous edit by the bot.
So:
  • Your claim to use the bot judiciously is demonstrably false.
  • Your claim that you do not game the system is demonstrably false.
  • Your labelling of @David Eppstein's complaint as uncivil is bogus, because David's complaint was civilly-worded and is demonstrably true.
I also note your offer above[13] to process batches selected by me. That amounts to another attempt to game the queueing system, in this case by inviting me to effectively use two of the bot's 4 channels. I explained this before, so I am surprised to see you suggesting it again. --BrownHairedGirl (talk) • (contribs) 15:13, 12 September 2021 (UTC)[reply]
@Wbm1058, I'm not super familiar with Citation bot, which Cloud Services resources do you think are being directly provided? If it just allows users to trigger a bot that runs on some pages I think that's fine since that's not directly letting users e.g. make SQL queries or execute bash commands. Please let me know if I missed something. Legoktm (talk) 17:08, 1 September 2021 (UTC)[reply]

Convert to cite biorxiv

Status
new bug
Reported by
Headbomb {t · c · p · b} 06:25, 10 September 2021 (UTC)[reply]
What should happen
[14]
We can't proceed until
Feedback from maintainers


Cleanup when converting to cite arXiv

Status
new bug
Reported by
BrownHairedGirl (talk) • (contribs) 14:13, 10 September 2021 (UTC)[reply]
What happens
bot leaves the url param in place, even tho it is unsupported. This is flagged by {{cite arXiv}} as an error
What should happen
the url param should be removed or commented out
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Mersenne_prime&diff=1043416331&oldid=1042445638
We can't proceed until
Feedback from maintainers


It also didn't add |class= Headbomb {t · c · p · b} 14:24, 10 September 2021 (UTC)[reply]
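The requested fix amounts to dropping (or commenting out) parameters that {{cite arXiv}} does not support when the bot converts a template, and adding |class=. A minimal sketch under those assumptions; the parameter set and function name are illustrative, not the bot's actual PHP code:

    # Hypothetical sketch of the cleanup requested above.
    UNSUPPORTED_IN_CITE_ARXIV = {"url"}   # illustrative; the real list is longer

    def convert_to_cite_arxiv(params: dict[str, str],
                              arxiv_class: str = "") -> dict[str, str]:
        kept = {k: v for k, v in params.items()
                if k not in UNSUPPORTED_IN_CITE_ARXIV}
        if arxiv_class:
            kept["class"] = arxiv_class   # per the note that |class= was missing
        return kept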

Bot still being abused by Abductive

@Abductive continues to game the bot's queueing system by making large numbers of single-page requests while the bot is processing a batch job requested by Abductive. This has the effect of denying other editors access to the bot, because two of the bot's four channels are in use by Abductive. Both the batch jobs and the single-page requests are poorly chosen.

See the bot's most recent 1,500 edits, spanning 9 hours. The following observations are all based on that set:

  1. the bot made 223 edits to pages requested by Abductive as batches, from "Category:Articles with specifically marked weasel-worded phrases from month year". (to verify, search the page for Category:Articles with specifically marked weasel-worded phrases from)
  2. The total number of pages scanned by the bot in those batches requested by Abductive is 561 (149 from Category:Articles with specifically marked weasel-worded phrases from April 2020 + 245 from Category:Articles with specifically marked weasel-worded phrases from March 2020 + 167 from Category:Articles with specifically marked weasel-worded phrases from February 2020).
    That is an edit rate of 40% (223/561), which initially seems impressive, but on closer scrutiny is misleading. I checked the most recent 30 edits in this set, and found that 7 of them were either purely-cosmetic edits which make no difference to readers, or trivial edits which changed only the type of quote mark. That leaves only 31% potentially significant edits, which is a low return on the bot's efforts.
  3. Of this set of 1500 edits, the bot made 82 edits from single-page requests (to verify, search the page for | #UCB_toolbar)
  4. 60 of those 82 edits (i.e. 73%) were from single-page requests by Abductive (to verify, search the page for Abductive | #UCB_toolbar)
  5. Note that the bot's contribs log only shows pages that were edited. Pages which were analysed but not edited still use one of the bot's 4 channels, but are not shown in the contribs list. So the count of single-page requests by Abductive is almost certainly significantly higher than 60. Given that Abductive's concurrent batch requests get about a 30% edit rate, it seems reasonable to assume that Abductive's single-page requests get a similar edit rate, which would put the total number of Abductive's single-page requests at about 200 in this 9-hour period, i.e. 22 per hour, or one every 3 minutes (the arithmetic is reproduced in the sketch below this comment).
  6. I examined the 30 most recent bot edits from single-page requests by Abductive. (To verify, search the page for Abductive | #UCB_toolbar):
    • the selection criteria are not evident: none of those 30 pages had recently been edited by Abductive.
    • Six of the 30 pages were bizarre choices:
      1. Junior Senior (TV series) — the 2 most recent edits to the page are by Citation bot, in each case requested by Abductive
      2. Neurometric function — 2 of the 3 most recent edits to the page are by Citation bot; the intervening human edit was trivial
      3. Neurolaw — the 2 most recent edits to the page are by Citation bot, the latest requested by Abductive
      4. Neuroimmunology — the 1st and 6th most recent edits to the page are by Citation bot. The intervening edits are 3 bot edits and 1 human dab edit, so nothing that would make more work for the bot.
      5. Neural backpropagation — the 2 most recent edits to the page are by Citation bot, the latest requested by Abductive
      6. Neuroanatomy of intimacy — the 2 most recent edits to the page are by Citation bot, the latest requested by Abductive

I am posting this just to put it on record, without any hope of Abductive ceasing to abuse their access to the bot. --BrownHairedGirl (talk) • (contribs) 14:40, 12 September 2021 (UTC)[reply]
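For the record, the estimates in points 2 and 5 above can be reproduced as follows; the 30% single-page edit rate is the stated assumption, not a measured figure:

    # Reproducing the estimates above (inputs taken from the thread).
    batch_edits, batch_pages = 223, 561
    edit_rate = batch_edits / batch_pages            # ~0.40 (point 2)
    significant = edit_rate * (1 - 7 / 30)           # ~0.31 after cosmetic edits

    single_page_edits = 60
    assumed_rate = 0.30                              # assumption stated in point 5
    est_requests = single_page_edits / assumed_rate  # ~200 over 9 hours
    per_hour = est_requests / 9                      # ~22, i.e. one every ~3 minutes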

The individual articles via UCB_toolbar are fine. It's the batch runs that aren't (e.g. Category:Articles with specifically marked weasel-worded phrases from April 2020). Headbomb {t · c · p · b} 14:44, 12 September 2021 (UTC)[reply]
@Headbomb: Even if both sets were fine on their own (and as above I think that neither is OK), the combination of the two leaves Abductive using 2 of the bot's 4 channels for much of the time. That impedes other editors who want to check a page. --BrownHairedGirl (talk) • (contribs) 15:17, 12 September 2021 (UTC)[reply]
A single article will 'hog' the bot for a few seconds to a few minutes tops, which lets other requests in just fine, unless there's several dozens of them made at the same time. Headbomb {t · c · p · b} 15:28, 12 September 2021 (UTC)[reply]
@Headbomb: Yes, but AFAICS dozens of them have been made at about the same time. I estimate on average one every ~3 minutes over 9 hours, but I doubt they were evenly spaced over that long a period; much more likely that they were clustered.
I was running some smaller batches today, and when starting new batches I found significant delays (10-30 minutes) in the bot starting to process them, while Abductive's single-page requests were being processed. That's no big deal for my batches, but for someone waiting for the bot to process a page they just edited, a 10-15 minute wait is a serious waste of their time. --BrownHairedGirl (talk) • (contribs) 15:37, 12 September 2021 (UTC)[reply]
I've been following this topic for a while and it seems that different editors have different views on when the bot should be used for batch runs. From what I can see there is plenty of documentation explaining how to use the bot, but none about when the bot should be used (or how to select the articles/categories for batch runs). Is it worth having an RFC or something similar to hopefully find consensus on when the bot should be used and when it shouldn't, bearing in mind how limited the capacity of the bot is? It might be that this RFC would only apply to batch runs. If consensus is found then surely it would make it easier to take action against those who ignore it. I'm not that clued up about these things so this might not be workable, but I thought it would be worth suggesting. RicDod (talk) 16:56, 12 September 2021 (UTC)[reply]
@RicDod: AFAICS, only Abductive thinks that their use of the bot is appropriate. Everyone else seems willing to apply commonsense, and use the bot efficiently without inconveniencing others.
An RFC seems like a very laborious response to one editor's WP:IDHT issues.
I agree that it would be good to have some guidance on when to use the bot, but we shouldn't need to rush into that to put an end to this disruption. --BrownHairedGirl (talk) • (contribs) 17:16, 12 September 2021 (UTC)[reply]
If one were to look at overall usage of the bot, one would find many, many examples of what looks like people hogging the bot, and of inefficient edits. Abductive (reasoning) 18:25, 12 September 2021 (UTC)[reply]
Nobody else does it with anywhere near the same persistence or intensity as Abductive. --BrownHairedGirl (talk) • (contribs) 19:05, 12 September 2021 (UTC)[reply]
You do. Abductive (reasoning) 19:21, 12 September 2021 (UTC)[reply]
With targeted, high-value runs. Not crapshoot runs. Headbomb {t · c · p · b} 19:23, 12 September 2021 (UTC)[reply]
I have offered, and User:BrownHairedGirl has sometimes accepted, to run batches that she has targeted. The offer still stands. Abductive (reasoning) 19:45, 12 September 2021 (UTC)[reply]
Not true. The only batch I sent you was a randomised test set, to help you test the cleanup rate, of Category:CS1 errors: bare URL after you expressed interest in processing it.
I showed you how to test and make batches from Category:CS1 errors: bare URL, which you selected. That is now complete.
I suggested that you run batches of pages which transclude {{Bare URL inline}}. You did that.
Now you have gone back to your long-term pattern of low-return speculative trawls, and you have repeatedly denounced the notion of selecting and processing targeted batches. --BrownHairedGirl (talk) • (contribs) 19:53, 12 September 2021 (UTC)[reply]
We have discussed the numbers many times in the last few weeks. So I find it very hard to believe that a competent, good faith editor could claim either that my use of the bot is inefficient, or that I am hogging it in the same way as an editor who accompanies batches with high volume of single-page requests. --BrownHairedGirl (talk) • (contribs) 19:36, 12 September 2021 (UTC)[reply]
Nobody says that your use of the bot is inefficient; the bot is inefficient. We have all experienced that 10 to 20 minute delay, even when only one or two channels are in use. The bot allows people to request categories that interest them. Any category, no matter the topic, will only get about 35% of its articles fixed. I select categories based on a number of metrics, including article readership. And my selections get above average results. Abductive (reasoning) 19:43, 12 September 2021 (UTC)[reply]
As an example, I have just requested the bot run Category:Ranchers from Texas, an interesting topic, at 19:52. Supposedly there are 143 articles in the category. Let's see how the bot performs. Abductive (reasoning) 19:54, 12 September 2021 (UTC)[reply]
Category:Ranchers from Texas just showed up in the feed, 2/143, at 20:18. That's quite an unexplained delay. Abductive (reasoning) 20:25, 12 September 2021 (UTC)[reply]
It looks like the bot made 29 fixes to Category:Ranchers from Texas, or just above 20%. I have started a run on the interesting but randomly selected Category:General topology at 21:25, which started in 2 minutes, a lot quicker than the 26 minute delay for the last one. Let's see how this one goes. Abductive (reasoning) 21:33, 12 September 2021 (UTC)[reply]
@Abductive: so in response to complaints about your repeatedly feeding the bot categories which generate low return, your response is to feed the bot two more categories which also produce low return.
And to top it all, you do so at a time when the bot is already very busy.
Which part of "stop doing this" is unclear to you? --BrownHairedGirl (talk) • (contribs) 21:41, 12 September 2021 (UTC)[reply]
So, it sounds like you consider perfectly legitimate use of the bot on small categories to be "abuse". Any user who runs the bot on a category is therefore "abusing" it. And one channel is being taken up by your run, contributing to making it busy. Is that bad somehow? Abductive (reasoning) 21:47, 12 September 2021 (UTC)[reply]
For the millionth time: running the bot on a few categories containing articles in a topic area where you edit is not the same thing as repeatedly selecting large numbers of big low-return categories just so that you can feed lots of pages to the bot.
Please stop pretending that the distinction is unclear to you.
And please stop pretending that you don't understand the distinction between high-return batches and low-return batches. --BrownHairedGirl (talk) • (contribs) 21:58, 12 September 2021 (UTC)[reply]
I just checked the bot's handling of my current job of 1,064 pages from 21st-century deaths, part 4 of 4.
The bot's contribs list shows 440 edits out of the 832 pages processed so far. That's a 53% edit rate.
Since by your own admission, your categories don't exceed 35%, they are wasting the bot's time. --BrownHairedGirl (talk) • (contribs) 20:08, 12 September 2021 (UTC)[reply]
Bizarre.
You just said above that my use of the bot was inefficient, but now you deny that.
The batches I select get over 50% of pages edited, and Headbomb's batches get even more. So there is no need to waste the bot's time on low-return sets. --BrownHairedGirl (talk) • (contribs) 19:57, 12 September 2021 (UTC)[reply]
I did not say you were inefficient. I said you use the bot more than I do. Abductive (reasoning) 20:00, 12 September 2021 (UTC)[reply]

"Above average results" not at 35% on random categories unrelated to a topic or citation-related cleanup category. My runs get 85-90% edit rates. Headbomb {t · c · p · b} 19:56, 12 September 2021 (UTC)[reply]

You're doing a great job, but your runs aren't category runs. Abductive (reasoning) 20:13, 12 September 2021 (UTC)[reply]
See User talk:BrownHairedGirl#Since you're curious. I do plenty of category runs. I just pick my categories sanely, instead of unrelated categories like Category:Articles with specifically marked weasel-worded phrases from April 2020. Headbomb {t · c · p · b} 20:20, 12 September 2021 (UTC)[reply]
@Headbomb: note how Abductive again completely misses the point. There is no requirement for Abductive to use the bot at all, or to use it on categories.
The problem remains that Abductive keeps on feeding the bot batches with low return. They should stop processing such batches. If they can find a way of selecting batches with higher return, then fine, run those batches ... but just stop running these low-return batches. No batches from Abductive would be better than this lot. --BrownHairedGirl (talk) • (contribs) 20:46, 12 September 2021 (UTC)[reply]
You have an idea of how the bot should be used. Your idea is wrong-headed, and you attempt to impose it on other editors. I am demonstrating that category runs, an allowed function of the bot, have a low rate of return, and that's not anybody's fault. I also refrain from using the bot when it is busy. Abductive (reasoning) 21:55, 12 September 2021 (UTC)[reply]
You are demonstrating that category runs, when poorly chosen, have a low rate of return, and that's the user's fault. — JohnFromPinckney (talk / edits) 22:03, 12 September 2021 (UTC)[reply]
The bot completed the run on randomly selected Category:General topology, making fixes to 52 out of 174 articles, just under 30%. Any time a user runs a category, we can expect this, and it is not a problem; the bot is functioning as intended. My choices exceed this level, and have the added benefit of being to high-visibility articles. I also reiterate that I refrain from running the bot when it appears busy. Abductive (reasoning) 22:28, 12 September 2021 (UTC)[reply]
Sigh. The problem is that a 30% edit rate is half what I achieve with my targeted edits, and 1/3 of what Headbomb achieves. So it's a highly-inefficient use of the bot's time.
It is true that any time a user runs a category, we can expect this ... so stop using categories for high-volume general cleanup.
You know the problem, and you know the solution. --BrownHairedGirl (talk) • (contribs) 23:19, 12 September 2021 (UTC)[reply]
The random sample category runs above show that my method of choosing categories gets superior results. So they are the opposite of speculative trawling. Furthermore, your idea of proper bot usage, if implemented, would preclude all category runs by all users. Abductive (reasoning) 23:30, 12 September 2021 (UTC)[reply]
More denialism.
Your method of choosing categories frequently leads to edit rates in the low teens: I have documented many such cases above, so your denials are untruths which must be known to you to be untrue.
Even the best cases you mention get edit rates of ~35%, which is about half of what I get and about a third of what Headbomb gets. This has already been demonstrated to you with documented examples, so your denial is another untruth which must be known to you to be untrue.
No, my idea would not preclude all category runs by all users. Category batches are an inefficient but handy way of cleaning up a set of articles concentrating topics which an editor works on. However, that inefficiency makes them a very bad way of making a topic-blind selection of articles in need of cleanup, and the problem remains that you are repeatedly using categories for a task to which they are ill-suited. --BrownHairedGirl (talk) • (contribs) 11:43, 13 September 2021 (UTC)[reply]
  • Comment I'd just like to be able to use the bot for single pages, but more often than not I get a failed message after about 10 minutes. I know other users have given up using the bot because of these problems. --John B123 (talk) 20:34, 12 September 2021 (UTC)[reply]
    That delay is exactly why I have taken the time yet again to document how Abductive's misuse of the bot is blocking such single uses.
    As Headbomb notes, the article will still be processed eventually. But you shouldn't have to wait so long for that to happen. --BrownHairedGirl (talk) • (contribs) 20:40, 12 September 2021 (UTC)[reply]
@John B123: btw, if you just click on 'expand citations' in your sidebar, the bot might look like it crashes, but it will still get to the article eventually. Might take 30 minutes, might take an hour, but it does get processed. Headbomb {t · c · p · b} 20:38, 12 September 2021 (UTC)[reply]
@Headbomb: Thanks, I'll give that a try. I've been using the 'Citations' button under the edit box which gives a pop-up window saying "Error: Citations request failed" after about 10 minutes if it hasn't run by then. --John B123 (talk) 20:58, 12 September 2021 (UTC)[reply]
@Headbomb: Tried that and got a browser error page with "citations.toolforge.org unexpectedly closed the connection." after about 10 minutes. --John B123 (talk) 21:02, 12 September 2021 (UTC)[reply]
@John B123: which page? Headbomb {t · c · p · b} 21:07, 12 September 2021 (UTC)[reply]
@Headbomb: It was either 8th Canadian Hussars (Princess Louise's) or 2/1st Pioneer Battalion (Australia). --John B123 (talk) 21:16, 12 September 2021 (UTC)[reply]
@John B123: ran on both, without issue. 8th had an edit, 2/1st had nothing to do. Headbomb {t · c · p · b} 21:48, 12 September 2021 (UTC)[reply]
@Headbomb: Ok, thanks for having a look. --John B123 (talk) 21:52, 12 September 2021 (UTC)[reply]

Edit summary omission

Status
 Fixed took a while to figure out, but it is working right now
Reported by
BrownHairedGirl (talk) • (contribs) 16:29, 12 September 2021 (UTC)[reply]
What happens
bot makes two changes: adds a date to one ref, fills in another bare ref. But the edit summary is just Add: date., with no mention of the bare URL being converted to CS1/CS2
What should happen
Edit summary should mention both changes.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Charlie_Russell_(naturalist)&diff=prev&oldid=1043865882
Replication instructions
Is it a factor that the bare url was enclosed in square brackets?
We can't proceed until
Feedback from maintainers


Very odd. Not sure how that could have happened. AManWithNoPlan (talk) 20:29, 12 September 2021 (UTC)[reply]

Here's another case where the bot omitted from the edit summary any mention of a bare URL which it elegantly converted to CS1/CS2:[15]. --BrownHairedGirl (talk) • (contribs) 11:33, 13 September 2021 (UTC)[reply]

MIAR weirdness

Status
 Fixed by adding miar.ub.edu/issn to the blacklist
Reported by
Headbomb {t · c · p · b} 17:45, 12 September 2021 (UTC)[reply]
What happens
[16]
What should happen
Leave as cite web, don't fill journal/doi
We can't proceed until
Feedback from maintainers