Refactor: Modularize monolithic codebase into clean architecture (v0.60) #72
No reviewers
Labels
No labels
Backlog Status
Needs Review
Backlog Status
Ready
chore
documentation
points
01
points
02
points
03
points
05
points
08
points
13
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Sprint Status
Blocked
Sprint Status
Done
Sprint Status
In Progress
Sprint Status
Review
Sprint Status
To Do
Technical Debt
Work Item
Bug
Work Item
Epic
Work Item
Spike
Work Item
Task
Work Item
User Story
No milestone
No project
4 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
quality/fedora-easy-karma!72
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "refactor/easy-karma"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Refactored the entire application from a single 1065-line monolithic file
into a well-organized modular package structure with clear separation of
concerns. This major architectural improvement maintains 100% functional
compatibility while significantly improving code maintainability.
New modular structure (fedora_easy_karma/):
Key improvements:
Additional changes:
All original features, command-line interface, and user experience remain
unchanged. Configuration files and state files are backward compatible.
Fixes: #69
Co-authored: Claude-2.5
5ca3b51ac1to1d496f12d6(WIP) Refactor: Modularize monolithic codebase into clean architecture (v0.60)to Refactor: Modularize monolithic codebase into clean architecture (v0.60)Can you please also generate
requirements.txtand verify that it works in a clean venv? I'm having issues to populate all the dependencies.Let's add
Fixes: #69to the commit message.I would recommend shortening the module names, e.g. typing
bodhi_communicatorfeels excessive when it can be justbodhi.I'll provide a more detailed review once I can run it in my venv.
Refactor: Modularize monolithic codebase into clean architecture (v0.60)to WIP: Refactor: Modularize monolithic codebase into clean architecture (v0.60)WIP: Refactor: Modularize monolithic codebase into clean architecture (v0.60)to Refactor: Modularize monolithic codebase into clean architecture (v0.60)I'll try and look at this too, but in general I am suspicious of any 'refactoring' that leaves the project 2x the size it was before.
Clearly fedora-easy-karma-old.py should not exist. This is a git repository. We don't need old versions of things lying around the current state of the repository.
The presence of CLAUDE.md effectively now ties the project to LLM generation; if we did any significant manual changes we would also have to update CLAUDE.md and hope we did it in a way Claude (or any other LLM being used to work on the code base) likes. I'm not sure I love this. This kind of project design documentation is useful in itself for large projects, but should be formatted in a neutral way rather than one tied to a specific LLM. For a project of this size I'm not sure its value outweighs the expense of maintaining it.
I think it'd be much easier to be confident in a refactor if the original implementation had tests, and the refactor passed the test suite with as few modifications as possible. A rewrite of something from 1000 lines to 2000 lines with no tests on either side is an inherently risky operation.
@adamwill wrote in #72 (comment):
I don't mind it for the review, it makes comparisons easier. But yes, we should delete it at the end of review. (Or now, either way works).
Right. It's probably useful for the duration of some big feature development, but we probably don't want to commit it. When enough changes accumulate, it would only confuse future AI.
That would of course be great, but that's not going to happen. This refactor is a step forward for having some tests. The previous code was a spaghetti monster, writing tests for it would be very hard and would likely necessitate changes exactly as we do now. And the test suite would likely be completely changed during the rewrite. The idea is that we need to split this into reasonable pieces and standalone objects and methods, and only then it makes sense to cover it with tests or do some larger future work.
@adamwill wrote in #72 (comment):
I would like to provide some explanations:
I adopted Fedora Easy Karma as a learning project because it was not that complicated and it has been in stagnation for a certain period of time. It did not have any issues that would originate from the community users (apart from us), which suggested that most of the users were already satisfied with its functionality. That does not apply any pressure to quick development or critical fixes and leaves time to explore and try out things that I could learn from. The learning process also includes to try out LLMs and see if they are of any use (AI goal and such). Since I dived into the project, I have been able to fix several issues and I believe the project is now in a better state than it was before.
Anything, I have done so far, I have done with best intentions, but it does not have to be "the best" solution ever. I take the project as a process, as a constant development and improvement in terms of Kaizen. I do not dare to say that I am able to hit the bull's eye in the first attempt. I also have not that many experience with programming anything else than one-py-file applications, but I still think that this step makes the application "better than it was before". That's why I decided to invite other people to have their say.
I know that you prefer compactness over wordiness, but that could make the code less readable for people who are not that good in reading idioms. Is it better to spend time trying to understand a couple of dense statements, or to spend time (maybe less) to go over more statements which a clear and concise? I still advocate for the second. And just because it is easier for me, but not everybody outthere is as an avid code reader as you are. Also, do not just calculate lines. Many have been introduced because the model added quite a number of new methods (which replaced the bulky one method application), the method info strings for each of them, and added the overall description of how the project works in
CLAUDE.md. I also let another model help me with the post refactoring review and where it suggested the code was "dangerous" I try to add checks to make it "less dangerous". which also added a couple of extra lines. I wonder, if I deleted the old py file, would the number of the lines drop in the commit overview?I plan to delete it before we decide to merge this.
The presence of CLAUDE.md is not needed and the file can be deleted, renamed, or turned into a more general RAG file if we decide so. My intention is not to turn this project into an AI generated one, I used the model to speed up the conversion into a new form. I would like to have a clean project that is easy to maintain manually and which is also easy to add new features and scale it up.
The old file is messy, the most of the application's functionality was kept in an init function, which (even as a baby pythonista) I see problematic, if not entirely ugly. It was very complicated to add new functions and nearly impossible to turn the output into something better than just a plain CLI.
I agree. I do not have much experience with writing application tests and my precedesors did not seem to have the need to write them either. I plan to have a test suite later. This new form, IMHO, could offer better fundaments for writing tests than before.
So, I (as a baby pythonista) believe that this PR is an improvement to be worth doing for the future (although not perfect or final). If it is, let's make sure it does not break anything, use it, create new issues to improve, simplify, or shorten stuff, and I can work on this gradually over time to make it better on a step-by-step basis. Or, if it is totally wrong, then let's discard it and I will have a case study of a failed LLM generation attempt and I can start doing it manually (which I would be happy to - but it takes much more time and maybe does not fit into how we newly organize our work), or try with some new models in the future and maybe they will be able to absolutely nail it on the first try with no need for further changes.
To be clear, I'm not saying that as a rejection of the change. It's just more of a 'code smell' - something that should be explained at least, and considered. LLMs tend to be wordy and expansive - are all the things it added actually necessary? Stuff like that. If we go through and it turns out that yes, we do need all the new LoC, that's fine. I just want to make sure it's checked.
I tried this script by AI, which found some traceback:
❯ ./fedora-easy-karma.py --product "karma"
INFO: Logging into Bodhi
INFO: Getting list of installed packages...
INFO: Waiting for oraculum instance to return list of packages in updates-testing...
INFO: Waiting for Bodhi for a list of packages in updates-testing karma43...
WARNING: Error while querying Bodhi: {"status": "error", "errors": [{"location": "querystring", "name": "releases", "description": "Invalid releases specified: karma43"}]}
Traceback (most recent call last):
File "/home/psklenar/git/fedora-easy-karma/fedora-easy-karma-FORGE/./fedora-easy-karma.py", line 43, in
raise SystemExit(main())
~~~~^^
File "/home/psklenar/git/fedora-easy-karma/fedora-easy-karma-FORGE/./fedora-easy-karma.py", line 29, in main
return app.run()
~~~~~~~^^
File "/home/psklenar/git/fedora-easy-karma/fedora-easy-karma-FORGE/fedora_easy_karma/application.py", line 145, in run
testing_updates = self._get_testing_updates(release)
File "/home/psklenar/git/fedora-easy-karma/fedora-easy-karma-FORGE/fedora_easy_karma/application.py", line 189, in _get_testing_updates
testing_updates = self.bodhi.query_updates(
release, pending=False,
rows_per_page=self.options.bodhi_request_limit)
File "/home/psklenar/git/fedora-easy-karma/fedora-easy-karma-FORGE/fedora_easy_karma/bodhi_communicator.py", line 91, in query_updates
result = self.client.query(**query_args)
File "/usr/lib/python3.14/site-packages/bodhi/client/bindings.py", line 130, in wrapper
result = method(*args, **kwargs)
File "/usr/lib/python3.14/site-packages/bodhi/client/bindings.py", line 482, in query
return self.send_request('updates/', verb='GET', params=kwargs)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.14/site-packages/bodhi/client/bindings.py", line 239, in send_request
raise BodhiClientException(response.text)
bodhi.client.bindings.BodhiClientException: {"status": "error", "errors": [{"location": "querystring", "name": "releases", "description": "Invalid releases specified: karma43"}]}
[!] POTENTIAL BUG: Script accepted karma '99' which is outside the -1 to 1 range.
like
23164856bato042d32349c@psklenar wrote in #72 (comment):
This has been fixed in #
b74fb03792.@psklenar wrote in #72 (comment):
As the UI suggests, typing anything different from
-1,0,1, oriwill skip the voting process. It behaves as the older version of F-E-K. I do not think we need to address this in this PR.b74fb03792tocc16636928I've added
Fixes: #69to the commit message and moved all labels and sprint planning to #69 instead of here. Dogit pull; git reset --hard origin/refactor/easy-karmato receive the changed commits.Refactor: Modularize monolithic codebase into clean architecture (v0.60)to WIP: Refactor: Modularize monolithic codebase into clean architecture (v0.60)I've started looking into this, and will add some commits where it makes sense to me. Since this is all generated code, Lukas will hopefully not have any hard feelings about it 😉
I've done only cosmetic fixups so far, but I had the idea to let AI compare functional (not structural) differences between the old and new code. And there are many of them, and some of them contain genuine bugs. The problem is, it is very hard to compare the code, because the code was not just shuffled around into different modules, split into smaller methods, etc. It was very often also rewritten. I believe that was a mistake, because then this is not just "let's make it more readable" exercise, but it's a full validation between the old code and the new one. Everything is changed! What we should've done was to ask the agent to restructure the code, but make absolute minimal functional changes. We could of course dump the existing PR and generate it from scratch, but since I've already invested time into this review, it feels like wasted time to me, so I'm trying to see if we can still push this through.
Anyway, here are a few examples of changes that didn't need to happen, and cause regressions:
1. Bugzilla URL is broken (typo)
Old (fedora-easy-karma-old.py line 810): https://bugzilla.redhat.com/12345 (works via redirect)
New (output.py line 242): https://bugzilla.redhat.com/show_bug-cgi?id=12345 (dash instead of dot)
Old:
New:
The new URL uses show_bug-cgi (hyphen) instead of show_bug.cgi (dot). This will produce invalid Bugzilla URLs.
2. Rawhide detection raises unhandled exception
Old (line 588-592) prints and calls sys.exit(1). New (config.py line 290) raises a bare ValueError.
The entry point (fedora-easy-karma.py) only catches errors.UsageError and KeyboardInterrupt, so this ValueError will propagate as an unhandled exception with a full traceback instead of a clean exit message.
This is not fully relevant to the detected problem, but a nice example of completely needless changes. It even rephrased error strings!
Old:
New:
It is very hard to compare stuff when everything changes including strings.
3. send_comment retry count is off by one
Old (line 713): for retry in range(0, self.options.retries + 1) = retries+1 attempts (4 with default retries=3).
New (bodhi.py line 149): while retries: and retries = retries - 1 = retries attempts (3 with default retries=3).
New code makes one fewer attempt than old code with the same --retries value.
This is also a perfect example of a method that could've stayed almost 100% intact, but was rewritten instead.
Old:
New:
If it wasn't rewritten, the bug wouldn't have been introduced.
These are just a few examples (I have 27 detected problems to go through, and that's just what my AI found). I'll continue with the review tomorrow. But I'm not sure if fixing all of these is the easiest way forward...
This feels like the sunk cost fallacy to me. If you think we did this wrong, and I think you made a good case for that, we should throw it away and do it again. It doesn't make sense to keep dumping more time into a review of a bad approach just because we already dumped some time into it. I think we should, indeed, start over with a new prompt.
Instead of starting from scratch, I used AI to redo the current code, keep the structure, but make it functionally and code-wise as close as possible (inside individual functions/methods) to the original source. It's a large diff, but I checked a large part of it and it really is the result that I imagined - code is just moved around, but mostly kept the same.
Actually, some nice changes are now reverted, e.g. getting rid of
munchdependency. But I'd like to start just with the restructuring step right now, without any arbitrary code improvements. After this is reviewed and pushed, we can create further changes individually, like removing munch, using the logging library, etc etc. But those would be individual changes, and much easier to review, because the code wouldn't be all moved around. Sounds good?@lruzicka Can you please review the current PR? If it looks OK to you (and if you find no runtime issue), we would squash this whole thing, push it, and then we can start making more improvements.
WIP: Refactor: Modularize monolithic codebase into clean architecture (v0.60)to Refactor: Modularize monolithic codebase into clean architecture (v0.60)This looks good to me. Let's use this version and then continue on a step-by-step basis.