We automatically add links to U.S. Code and other citations. In this case Congress.gov is missing rich formatting which we have (I'm not sure why they are missing it for this bill, normally they have it). GovTrack also allows making diff-like comparisons between bill versions and between bills (for example, you can see the last-minute changes made ahead of the vote on this bill).
Source code is available on GitHub if anyone wants to try making GovTrack better, although it's quite complicated because Congressional information is complicated and there's no real money behind this: https://github.com/govtrack/govtrack.us-web/
If anyone has particular thoughts on what would be helpful when viewing bill text --- within the realm of the information that is actually freely available --- I am all ears.
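For anyone who wants to try a diff-like comparison locally, here is a minimal sketch using Python's standard `difflib`. It is not how GovTrack does it; it assumes you already have two versions of a bill as plain text, and the function name and labels are made up for illustration.

```python
import difflib


def diff_versions(old_text: str, new_text: str,
                  old_label: str = "previous version",
                  new_label: str = "current version") -> str:
    """Return a unified diff between two plain-text bill versions."""
    lines = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_label,   # labels are arbitrary, shown in the diff header
        tofile=new_label,
        lineterm="",          # lines were split without newlines
    )
    return "\n".join(lines)


if __name__ == "__main__":
    old = "SEC. 1. Short title.\nSEC. 2. Funding of $5."
    new = "SEC. 1. Short title.\nSEC. 2. Funding of $7."
    print(diff_versions(old, new))
```

A line-based diff like this is crude for legal text, where one changed word can sit inside a very long paragraph, so a word-level comparison would probably read better in practice.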
I would love a Genius.com / annotation layer on top of these bills too. Just a dream I'm sharing out loud for no particular reason :) love govtrack in general otherwise!
It need not be shared; think more like a public Notion/SharePoint document with comments visible, i.e. experts (users) can create their own individual annotated versions and share them with others.
As long as there is no single version of the annotations, moderation is not needed.
Side-note, if anyone wants to really dig into all the data available about bills (including votes, attachments, etc.), this is a great place to start: https://github.com/unitedstates/congress
There's excellent documentation on the formats and how to access all the data.
What I find most frustrating are the bills written as prose-diffs themselves: "In some entirely different piece of law, Foo shall be inserted after Bar, with an overall effect and purpose which will not be described here."
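To make the "prose-diff" idea concrete, here is a toy sketch that applies one such instruction to a piece of text. The instruction pattern (`by inserting "X" after "Y"`) is a simplified, hypothetical shape; real amendatory language is far more varied, so treat this as an illustration rather than a working amendment parser.

```python
import re

# Hypothetical, simplified instruction shape. Real bills phrase this
# in many different ways that this pattern will not catch.
INSERT_AFTER = re.compile(
    r'by inserting "(?P<new>[^"]+)" after "(?P<anchor>[^"]+)"'
)


def apply_insert_after(law_text: str, instruction: str) -> str:
    """Apply a single 'insert X after Y' instruction to law_text."""
    match = INSERT_AFTER.search(instruction)
    if match is None:
        raise ValueError("instruction not recognized")
    anchor = match.group("anchor")
    if anchor not in law_text:
        raise ValueError(f"anchor text {anchor!r} not found")
    # Only the first occurrence of the anchor is amended.
    return law_text.replace(anchor, anchor + match.group("new"), 1)


if __name__ == "__main__":
    law = "The Secretary shall publish Bar annually."
    amendment = 'Section 5 is amended by inserting " and Foo" after "Bar".'
    print(apply_insert_after(law, amendment))
```

Even in this toy form you can see the problem: the instruction is meaningless until you fetch the text it patches and apply it.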
See https://www.govinfo.gov/bulkdata/BILLS/resources. Specifically the billres.xsl and associated stylesheets. You can use those with the Saxon XSLT processor to transform the XML files into an HTML view similar to what the PDFs look like.
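A rough sketch of driving Saxon from a script, assuming Java is installed and a Saxon-HE jar is available locally. The jar name and the input/output file names below are placeholders, not something from the govinfo page; the `-s:`, `-xsl:` and `-o:` options are Saxon's standard command-line switches.

```python
import subprocess


def saxon_command(bill_xml: str, stylesheet: str, output_html: str,
                  saxon_jar: str = "saxon-he.jar") -> list[str]:
    """Build the Saxon command line for one XML-to-HTML transform."""
    return [
        "java", "-jar", saxon_jar,   # jar path is an assumption
        f"-s:{bill_xml}",            # source XML document
        f"-xsl:{stylesheet}",        # e.g. billres.xsl from govinfo
        f"-o:{output_html}",         # where to write the HTML
    ]


def transform_bill(bill_xml: str, stylesheet: str, output_html: str,
                   saxon_jar: str = "saxon-he.jar") -> None:
    """Run the transform; raises CalledProcessError if Saxon fails."""
    subprocess.run(
        saxon_command(bill_xml, stylesheet, output_html, saxon_jar),
        check=True,
    )


if __name__ == "__main__":
    # Placeholder file names; this only prints the command it would run.
    print(" ".join(saxon_command("bill.xml", "billres.xsl", "bill.html")))
```

Keep the associated stylesheets in the same directory as billres.xsl, since it pulls them in by relative path.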
Should be a fun little XML parser to write, converting the thing to HTML.
Except that it's a government thing so the parser's probably not going to be little. :)
Edit: The thing's basically XHTML without any kind of header. UTF-8 encoding, it looks like. So a conversion tool would just need to wrap it up and add styling.
Edit: Despite hints that it's XHTML, it's not valid XHTML.
Some friends just made this: https://www.congressionalrag.com/ - they need help from anyone interested, especially around pulling in more data sources.
There should at least be an AI sidebar on congress.gov. I think Americans would learn a whole lot with such a thing, but who wants to foot the bill for this one?
It's easy to imagine a non-technical user asking the AI a question and implicitly trusting the response as factual, without understanding anything about hallucinations.
Versus what? An intractable archive of unreadable documents? At the very least they'll get tractable information, which humans will always use on social media to make a point, which will then get fact checked. I prefer that loop. Right now the information is hidden in coffers and never gets taken for a loop.
A root cause analysis would probably suggest the question - Why are our representatives passing intractable, unreadable documents as law and how can we prevent them from doing that? Or more generally, what changes can be made to our government institutions to improve clarity in communicating actions and decisions to voters?
Yeah, it's naive thinking, and I'm well aware the obfuscation is sometimes the point.
But I digress... My main takeaway here is that we should be considerate of what problems adding AI to the equation may cause. I'm old enough to have seen how "the new big thing" ends up getting applied to every problem space, without really thinking about the consequences.
"Why are our representatives passing intractable, unreadable documents"
Best guess?
1) These are actual laws so they carry all the legal thoroughness
2) Like managers, they don't write the actual code (they don't do the actual writing of the bills). So managers don't really care just how awful the code can be (or in this case, just how intractable the bills are)
3) No one is code reviewing (the public is too uneducated to even do so)
This leads to these things being drafted in the dark of night and passed in the dark of night. I'm open to AI in this case simply to even begin having insight.
Paste the entire thing into the LLM! Maybe people can stop relying on unreliable partisan sources to interpret bills if they have tools to grok the dense weird language in them themselves. I say this even though I was embarrassed yesterday when the LLM misinterpreted something and I posted it - read the reference text behind any summary :/
Maybe even today an LLM is better than hearing about what the bill contains from social media reposts. The more the actual text is accessible the better (and accessible is not just technically accessible, but also understandable to the reader).
The Government Publishing Office and Library of Congress provide XML-formatted bills and all their amendments, plus a feed of all changes to every bill.
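As a small illustration of what you can do with the XML, here is a sketch that lists section numbers and headings using only the standard library. The element names (`legis-body`, `section`, `enum`, `header`) are my understanding of the bill XML format and the sample document is invented, so check them against a real file before relying on this.

```python
import xml.etree.ElementTree as ET

# Invented sample that mimics the assumed structure of a bill XML file.
SAMPLE = """<bill>
  <legis-body>
    <section id="s1"><enum>1.</enum><header>Short title</header>
      <text>This Act may be cited as the Example Act.</text></section>
    <section id="s2"><enum>2.</enum><header>Definitions</header>
      <text>In this Act, the term Foo means Bar.</text></section>
  </legis-body>
</bill>"""


def list_sections(xml_text: str) -> list[tuple[str, str]]:
    """Return (number, heading) pairs for every <section> element."""
    root = ET.fromstring(xml_text)
    sections = []
    for section in root.iter("section"):
        # findtext returns None when the child is missing.
        enum = (section.findtext("enum") or "").strip()
        header = (section.findtext("header") or "").strip()
        sections.append((enum, header))
    return sections


if __name__ == "__main__":
    for number, heading in list_sections(SAMPLE):
        print(number, heading)
```

Real files are much larger and nest subsections, paragraphs and quoted amendment blocks inside sections, so a real outline tool would need to walk the tree more carefully.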
Oh and on the topic of party politics, Bill Clinton was the one who had them put things online in the first place with the GPO Electronic Information Access Enhancement Act, and Barack Obama and the Democrats expanded it via American Recovery and Reinvestment Act of 2009 - not the do-nothing Republicans.
Congress.gov, originally THOMAS.gov, was a product of the Republican Contract with America take-over of Congress in the mid 1990s. Republicans in Congress, including Rep. Issa for example, were helpful in expanding the information that Congress publishes publicly. In the last 15 years, efforts to make Congress publish more and better-structured information have been relatively bipartisan and, mostly, led by nonpolitical staff. I would not describe Democrats as having been the ones to have exclusively created the access to congressional information that we have today, although Democrats in recent years have led on government transparency and accountability issues generally, beyond the Legislative Branch.
Changes that have required legislation have, as far as I'm aware, not really been influenced by the President, other than being signed into law, since they are Legislative Branch concerns and not Executive Branch concerns.
OP may have been unlucky on the timing. The site isn't usually down. Here's the link to the text of H.R. 1 on GovTrack: https://www.govtrack.us/congress/bills/119/hr1/text
See https://congress.dev/bill/119/House/1/EH
Note: it could be worth checking the issues at https://github.com/usgpo/bulk-data/issues as some of those contain fixes and formatting improvements.
Except that it's a government thing so the parser's probably not going to be little. :)
Edit: The thing's basically XHTML without any kind of header. UTF-8 encoding, it looks like. So a conversion tool would just need to wrap it up and add styling.
Edit: Despite hints that it's XHTML, it's not valid XHTML.
Edit: Stick this at the top of the file:
--------------------- 8< ---------------------
<!DOCTYPE html>
<html>
<head>
</head>
--------------------- 8< ---------------------
And add this to the bottom of the file:
--------------------- 8< ---------------------
</html>
--------------------- 8< ---------------------
I'll leave it as an exercise to the reader to write a script to do that. Automatically extracting the bill title should be Fun.
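One possible take on that exercise, as a sketch rather than a tested tool. It wraps the fragment in the header and footer shown above, and adds a `<meta charset>` and a `<title>`, which are my additions. The title guess relies on a hypothetical pattern ("H. R. 1", "H.R. 1", "S. 123") that may not match the real files.

```python
import html
import re

# Hypothetical pattern for the bill number; adjust after looking at real files.
BILL_NUMBER = re.compile(r"\b(H\.\s?R\.|S\.)\s?(\d+)\b")


def guess_title(fragment: str, default: str = "Bill text") -> str:
    """Return a best-guess title such as 'H.R. 1', or the default."""
    match = BILL_NUMBER.search(fragment)
    if match is None:
        return default
    chamber = re.sub(r"\s", "", match.group(1))  # "H. R." -> "H.R."
    return f"{chamber} {match.group(2)}"


def wrap_fragment(fragment: str) -> str:
    """Wrap a headerless fragment in a doctype, <html> and <head>."""
    title = html.escape(guess_title(fragment))
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        '<meta charset="utf-8">\n'      # assumes the file really is UTF-8
        f"<title>{title}</title>\n"
        "</head>\n"
        f"{fragment.rstrip()}\n"
        "</html>\n"
    )


if __name__ == "__main__":
    sample = "<body><pre>H. R. 1\nAN ACT</pre></body>"
    print(wrap_fragment(sample))
```

Styling would go in the `<head>` as well; the sketch leaves that out.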
https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/...
seems to be broken on the "Big Beautiful Bill" right now though :(, I'm taking a look to see what's going on
https://github.com/saihaj/DOGE-AI
https://www.congress.gov/119/bills/hr1/generated/BILLS-119hr...