> Differential privacy makes this trade-off explicit, and thus impossible to ignore.
I think he has it backwards here.
Techniques like differential privacy hide the fact that a trade-off exists, except for a small cadre of experts who live and breathe this stuff.
I don’t know enough to defend this decision, but it strikes me that if there is a real trade-off, not having access to these techniques will force people other than statisticians to confront the trade-off.
If data about the public is so dangerous that we must disguise the results, then perhaps it's data we shouldn’t be collecting in the first place.
Nope. Private data about people is regularly published unintentionally; Netflix viewing histories and medical records are some of the notable examples.
People are bad at making the tradeoff because they consistently underestimate the amount of information that is leaked. Forcing them to leak safe amounts of information is the right way.
Not sharing or collecting the data could in some cases be better, but there is clear value in this data, so the optimal amount to store and make public is not zero.
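To make "safe amounts" a bit more concrete, here is a minimal sketch of the textbook Laplace mechanism applied to a counting query. It is not how any particular agency or product does it. The function names and the epsilon values are my own assumptions for illustration. Smaller epsilon means more privacy and noisier answers, which is the trade-off being argued about.

```python
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    # A counting query has sensitivity 1: adding or removing one person
    # changes the true count by at most 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)

def mean_abs_error(true_count: int, epsilon: float,
                   trials: int = 2000, seed: int = 0) -> float:
    # Average absolute error of the noisy count over many releases
    # (hypothetical helper, only here to show the accuracy cost).
    rng = random.Random(seed)
    total = sum(abs(private_count(true_count, epsilon, rng) - true_count)
                for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    for eps in (0.1, 1.0):  # illustrative values only
        print(f"epsilon={eps}: mean abs error ~ {mean_abs_error(1000, eps):.2f}")
```

The expected absolute error of the noisy count is 1/epsilon, so the privacy knob and the accuracy cost are the same number read two ways.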
I think the real killer is that everyone knows their data has been leaked six times over, and yet nothing bad has come of it for 99% of people.
If there were an apocalyptic privacy breach that led to 40% of the population losing their savings, people would be smashing their smart TVs in the streets a day later.
But alas, nothing bad actually manifests (besides the suspicious ads that know you really like Tide detergent).
imho, one big reason why Data Science as a big org lost clout in tech companies was a tendency to treat data scientists as gatekeepers of data. Outsourcing the responsibility for statistical thinking gave many of them a weird power trip: one dude gets to decide the trade-offs without anyone around him needing to properly understand them.
> If data about the public is so dangerous that we must disguise the results, then perhaps it's data we shouldn’t be collecting in the first place.
By this logic no one should ever collect your address for any reason ever. How do we function as a society if we can’t ever give PII in any context? Anonymization/security is critical and makes a lot of critical functions possible.
How could you receive your mail in a world where we never give out/collect info that is potentially hazardous?
Name, address, and phone number served plenty of critical functions when they were published in the White Pages. Cell phones not being listed there was kind of an accident of history. It was common to call a listed landline and be given or forwarded to a cell number. Only after most people stopped having landlines altogether did a phone number come to be considered sensitive information (unless you were a celebrity or something).
Ironically Facebook is responsible for much of this, as friending someone on Facebook became a lower stakes, less intimate alternative to exchanging phone numbers.
It would be entirely possible to limit the scope of things by making sure the company that has your address (UPS or USPS, say) never has the other information. Each business would hand off a zero-knowledge identifier to you that you'd give to the others: Amazon would only know that the payment identifier they gave you was fulfilled at VISA somehow, and would then hand the package off to UPS with an identifier that they would never see again.
An argument about whether or not to deploy differential privacy on large statistical databases has no bearing whatsoever on whether or not you give your address to have a package delivered. If you want the package delivered, you have to give your address.
On the other hand, it’s not at all clear that people should have to involuntarily, by force of law, offer up all sorts of personal details about their lives. And questions about whether the use of differential privacy can or should justify the collection of sensitive information are quite valid.
The census is justified by the idea that it will help us plan for the future. But the track record of central planning is poor to disastrous.
A small example: in theory population changes could inform land use decisions. In practice however, the ability of population to increase is softly capped by the amount of housing that exists, or will exist. If you restrict or frustrate housing, you will also restrict people from living where they want to live. Then the planners will point to the census data and tell you that nobody wants to live there and therefore there’s no need for change.
Ironically, if you wanted to measure where people want to live in order to get information for planning purposes, the number is right there and doesn’t require any personal data collection at all: it’s the price (in this example, $ per square foot of floor space). But in my experience people who like central planning don’t believe in prices, so they ignore that, look at their reams of personal data, and conclude that all is well in the world. It is hard for me to be sympathetic if one day folks like that have less data to look at.
> If data about the public is so dangerous that we must disguise the results, then perhaps it's data we shouldn’t be collecting in the first place.
We agree that doxxing is dangerous online, yes? Your point about the White Pages is exactly what I’m talking about. A piece of data isn’t inherently dangerous or not dangerous. It’s about context and ease of access by actors with various intentions.
Sure, it’s impossible to misconstrue that the statue of that size has or had value to someone. But that still doesn’t mean it will have value to you.
Someone who’s never seen it before, who has no exposure to the cultures that produced it or the discourse around it, can be impressed by the size, but otherwise not care.
Worse, if such a person is actively hostile to the cultures that produce it, then learning that it is valuable to that culture will lead them to assign negative value to it.
> how a normal person can become a maintainer though.
Is the goal to produce high-quality software, or is the goal to produce an apprenticeship scheme for developers who are interested in the project but not so interested that they are willing to write an email to introduce themself or otherwise engage in normal human social interactions?
Normal people will still be able to get involved if they want to, just like normal people can get jobs. You learn about the organization you’re interested in joining, you try to meet some people and introduce yourself, you gain trust and prove your worth. It can be true that a pull request once embodied some of these tasks, but it is not true that being unable to submit a request means that these tasks are no longer possible to perform. It just means you’ll have to do them differently, just like the rest of humanity does when they want to get involved in an organization.
>Normal people will still be able to get involved if they want to,
There isn't much evidence that this is happening. When you eyeball the average age of maintainers on mailing lists these days for prominent open source projects like the Linux kernel, it's been steadily creeping up, to somewhere in the 50s now, I'd guess. There's a complete dearth of people in their 20s or even 30s in positions of responsibility in particular; there's no next generation of leaders.
That's fatal in the long run. You need an apprenticeship system or something like a vocational pipeline to engage people in a structured way, so that you can produce talent and also be objective and systematic. Something like the guild system you have in the DACH region, where companies survive for centuries, and that's not because random people write mails, it's because there's an industry-wide support system and training process.
Yes but that predates AI & projects rejecting pull requests, so I would argue this is a separate unrelated phenomenon. If anything, the fact that despite accepting PRs, most projects have little new blood, means that PRs were rarely a significant pathway for future maintainers.
Yes. Open source existed and thrived before GitHub, before git, and before anyone had ever used the words “pull request”.
It was different, to be sure, but it was not worse. We are living through a transition, but people do that all the time and we adjust our behaviour and we find new equilibriums. We will do that with open source too, and if it ends up looking more like open source in the 80s or 90s, it’s gonna be fine.
Maybe some people who got really good at gaming their GitHub reputation are going to lose out, but that was never the point. Anyone who likes this kind of work and wants to get involved will find a way.
That assumes there is no value whatsoever in doing your own chores. If you want to value time w/friends & family over chores, fair enough, but doing chores is definitely a better & more valuable use of time than zoning out on TikTok or gambling etc.
> In order for that to change, the market has to increase in size by appealing to a more casual audience, or existing gamers have to pay more.
The fun part of all this is when union demands start forcing the industry in the opposite direction: higher cost, higher prices, smaller market. In a sane world, we would connect the two, but in this world, we will just blame management. The union will forever have an invincible PR shield no matter how crazy the demand.
While I fundamentally agree with the concern about unions raising costs in a market where most titles cannot absorb them, GTA/Rockstar definitely can. Especially since the union is fighting for basic quality of life, like no crunch, instead of (for now at least) increased pay. I am generally not pro-union, but crunch, especially at studios that are guaranteed to be profitable (GTA), needs to be curbed.
In what world are unions never criticised? I'm in the UK and they are often reviled in the press and among people who don't work in a unionised sector. America has an even stronger tradition of anti-union feeling (maybe partly due to historic links between unions and organised crime but also because the US has often had a stronger collectivisation than most European countries - consider that the political centre in the US would be considered right wing in most Western countries on most issues).