Things that I think should be noted and filed in an orderly fashion, but which I lack time to address right now. Content will change incessantly.
Notes
I need to reclassify the bio computing links; that section has become confusing and there are too many nice ideas there not clearly distinguished.
To subscribe to
All done
To post to team chat
OpenAI: How Do They Do It? Lessons from Jiu-Jitsu Innovation and for S&T Policy
Experiment metascience grant
Via Louise Ord, virtuosic algorithmic line art in MetaPost
FAQ, Frequently Asked Questions about HandWiki Encyclopedia of science and computing
Matthew Feeney, Markets in fact-checking
Étienne Fortier-Dubois, The elements of scientific style
Saloni Dattani, Real peer review has never been tried
pyg-team/pytorch_geometric: Graph Neural Network Library for PyTorch
Eight Graphs That Explain Software Engineering Salaries in 2023
Typst: Compose papers faster reimagines LaTeX with modern tech and workflow. I think they are doomed: they fix the horrible, awful, nastybad misfeatures of most of the LaTeX workflow, but they do not support the part of LaTeX that people (rather than journals) actually need, namely the mathematical markup.
LASER-UMASS/Themis: Themis™ is a software fairness tester.
Brittany Johnson-Matthews: Causal testing: understanding the root causes of defects
Tianyi Zhang: Interactive Debugging and Testing Support for Deep Learning
DeepSeer: Interactive RNN Explanation and Debugging via State Abstraction / momentum-lab-workspace/DeepSeer
Eugenio Culurciello, The fall of RNN / LSTM. We fell for Recurrent neural networks…
Eric Topol, When M.D. is a Machine Doctor
Women in AI awards
Links
Flyte: An Open Source Orchestrator for ML/AI Workflows - The New Stack
Build production-grade data and ML workflows, hassle-free with Flyte
Cloud-Native Geospatial Foundation
The Cloud-Native Geospatial Foundation is a forthcoming initiative from Radiant Earth created to increase adoption of highly efficient approaches to working with geospatial data in public cloud environments.
fast.ai - Mojo may be the biggest programming language advance in decades
Nostr, a simple protocol for decentralizing social media that has a chance of working
Lilian Weng’s updated The Transformer Family Version 2.0
Sam Kriss, in All the nerds are dead, conflates geeks and nerds, but is funny anyway
The General Theory of Employment, Interest and Money by John Maynard Keynes
The reasonable(?) effectiveness of data analysis
Why is it that we can be thrown into the work of other people, in a field we have zero experience in, and have any expectation of making any useful impact at all? When stated objectively, it sounds utterly ridiculous. But in my experience, a data team can find something to make an improvement on, even if the impact can sometimes be small.
Tackling Collaboration Challenges in the Development of ML-Enabled Systems: “I highlight the findings of a study on which I teamed up with colleagues Nadia Nahar (who led this work as part of her PhD studies at Carnegie Mellon University), Christian Kästner (also from Carnegie Mellon University), and Shurui Zhou (of the University of Toronto). The study sought to identify collaboration challenges common to the development of ML-enabled systems. Through interviews conducted with numerous individuals engaged in the development of ML-enabled systems, we sought to answer our primary research question: What are the collaboration points and corresponding challenges between data scientists and engineers? We also examined the effect of various development environments on these projects. Based on this analysis, we developed preliminary recommendations for addressing the collaboration challenges reported by our interviewees.”
Probability Is Not A Substitute For Reasoning – Ben Landau-Taylor
Self-Healing Concrete: What Ancient Roman Concrete Can Teach Us
Differentiating the discrete: Automatic Differentiation meets Integer Optimization | μβ
Information Transfer Economics: Organization of information equilibrium concepts
Serge Zaitsev, World’s smallest office suite
Annie Lowrey, We Haven’t Been Measuring How the Economy Really Works
Alex Komoroske, On Schelling Points in Organizations
Alex Komoroske, Coordination Headwind - How Organizations Are Like Slime Molds
Alternative to the tedious openhub workflow: analyzemyrepo.com | about
TIL Apophenia vs Pareidolia
Jason Collins, We don’t have a hundred biases, we have the wrong model
Schimmack on Psychological Science and Real World Racism
If there are already smarter people around, how can I find good ideas?
Are Model Explanations Useful in Practice? Rethinking How to Support Human-ML Interactions.
Colossal-AI is designed to be a unified system to provide an integrated set of training skills and utilities to the user. You can find the common training utilities such as mixed precision training and gradient accumulation. Besides, we provide an array of parallelism including data, tensor and pipeline parallelism. We optimize tensor parallelism with different multi-dimensional distributed matrix-matrix multiplication algorithm. We also provided different pipeline parallelism methods to allow the user to scale their model across nodes efficiently. More advanced features such as offloading can be found in this tutorial documentation in detail as well.
dynamicslab/pykoopman: A package for computing data-driven approximations to the Koopman operator.
Neural Transducer Training: Reduced Memory Consumption with Sample-wise Computation
Building Resilient Organizations: Toward Joy and Durable Power in a Time of Crisis
Is Anything Worth Maximizing? How metrics shape markets, how we’re… | by Joe Edelman
factorization_machine Something something kernels, something regression something interaction effects.
Facebook, Google Give Police Data to Prosecute Abortion Seekers
Cleanlab: “We publish research, develop open source tools, and design interfaces to help you improve the quality of your datasets and diagnose various issues in them.” See their blog e.g. ActiveLab: Active Learning with Data Re-Labeling
TL; DR—In-context learning is a mysterious emergent behavior in large language models (LMs) where the LM performs a task just by conditioning on input-output examples, without optimizing any parameters. In this post, we provide a Bayesian inference framework for understanding in-context learning as “locating” latent concepts the LM has acquired from pretraining data. This suggests that all components of the prompt (inputs, outputs, formatting, and the input-output mapping) can provide information for inferring the latent concept. We connect this framework to empirical evidence where in-context learning still works when provided training examples with random outputs. While output randomization cripples traditional supervised learning algorithms, it only removes one source of information for Bayesian inference (the input-output mapping).
Bayesian Neural Networks by Duvenaud’s team
Rohit, People always put their money in futures they predict
What have we seen so far? People didn’t use to have much disposable income to invest a century ago. When they did, or rather those who did, invested their savings mostly in land or (if they were rich enough) businesses, or commodities.
Where should I invest my money is a relatively old question, but until recently it wasn’t a very interesting question. This is because until recently the answers were understood, but not that actionable. The futures would get better, things would get built, and you could ride optimism as a thesis if you could find a way how. The avenues available were extremely limited, and the optionality you had was minimal.
Olúfẹ́mi O. Táíwò, Identity Politics and Elite Capture
What’s the difference between a tutorial and how-to guide? - Diátaxis
Instagram, TikTok, and the Three Trends
the company correctly intuited a significant gap between its users stated preference — no News Feed — and their revealed preference, which was that they liked News Feed quite a bit. The next fifteen years would prove the company right.
Kedro | A Python framework for creating data science code /Kedro Frequently asked questions. Kedro rationale by Joel Schwarzmann: The importance of layered thinking in data engineering
Darts:
Prof Steve Keen | Creating realistic economics for the post-crash world
Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you are able to combine them with other sources of computation or knowledge.
This library is aimed at assisting in the development of those types of applications.
How did places like Bell Labs know how to ask the right questions?
Color Oracle simulates color blindness for accessibility of visualisations and plots etc
darrenjw/fp-ssc-course: An introduction to functional programming for scalable statistical computing
Do organizations have to get slower as they grow? (with Alex Komoroske)
Kolibri is an open-source educational platform specially designed to provide offline access to a wide range of quality, openly licensed educational resources in low-resource contexts like rural schools, refugee camps, orphanages, and also in non-formal school programs.
What even are GFlownets?
Team Silverblue — About packaged apps for Fedora
The Adaptable Linux Platform Guide. Packaged apps for SUSE.
The Carr–Madan formula is really just a special case of a Taylor expansion. For completeness, let’s rederive the Taylor expansion with an integral remainder.
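For my own reference, a sketch of the identity being described. I am assuming \(f\) is twice continuously differentiable on \([0,\infty)\), \(S \ge 0\), and \(\kappa \ge 0\) is an arbitrary expansion point; the symbols \(f\), \(S\), \(K\) and \(\kappa\) are my notation and not necessarily the post's.

```latex
% Taylor expansion of f around \kappa with integral remainder
\begin{align}
f(S) &= f(\kappa) + f'(\kappa)\,(S-\kappa)
        + \int_{\kappa}^{S} f''(K)\,(S-K)\,\mathrm{d}K \\
% Split the remainder by the sign of S-\kappa:
% for S > \kappa only the second integral below is non-zero,
% for S < \kappa only the first one is.
     &= f(\kappa) + f'(\kappa)\,(S-\kappa)
        + \int_{0}^{\kappa} f''(K)\,(K-S)^{+}\,\mathrm{d}K
        + \int_{\kappa}^{\infty} f''(K)\,(S-K)^{+}\,\mathrm{d}K
\end{align}
```

The integrands \((K-S)^{+}\) and \((S-K)^{+}\) are put and call payoffs at strike \(K\), which is presumably why the second line gets read as a replication recipe rather than as a plain Taylor remainder.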
Tom Stafford, When explaining becomes a sin. File under taboos and Tetlock and compassion/comprehension
Karloo Pools and the hidden alternative swimming spots nearby—Walk My World
Cult Classic ’Fight Club’ Gets a Very Different Ending in China
A Turkish Farmer Tests Out VR Goggles on Cows To Get More Milk
How to buy a social network, with Tumblr CEO Matt Mullenweg—The Verge
Fake Feelings—ai emo. When post-hardcore emo band Silverstein… | by Dadabots—Medium
Why the super rich are inevitable
Meanwhile, the richer player will gain money. That’s because, from their perspective, every game they lose means they have an opportunity to win it back—and then some—in the next coin flip. Every game they win means, no matter what happens in the next coin flip, they’ll still be at a net-plus.
Repeat this process millions of times with millions of people, and you’re left with one very rich person.
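A minimal sketch of the coin-flip game in the quote above, to convince myself of the claim. I am assuming the usual "yard-sale" setup, in which each bet is a fixed fraction of the poorer player's wealth; the function names, the 20% stake and the player count are my own illustrative choices, not taken from the article.

```python
import random


def gini(wealth):
    """Gini coefficient: 0 means perfect equality, (n-1)/n means one holder has everything."""
    xs = sorted(wealth)
    n = len(xs)
    total = sum(xs)
    # Standard closed form on ascending-sorted data, with 1-based ranks.
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def yard_sale(n_players=100, n_rounds=100_000, stake=0.2, seed=0):
    """Simulate fair coin-flip bets between random pairs (assumed yard-sale rules)."""
    rng = random.Random(seed)
    wealth = [1.0] * n_players  # everyone starts equal
    for _ in range(n_rounds):
        a, b = rng.sample(range(n_players), 2)
        # The amount at risk is a fraction of the POORER player's wealth,
        # so nobody can lose more than they have.
        bet = stake * min(wealth[a], wealth[b])
        if rng.random() < 0.5:
            wealth[a] += bet
            wealth[b] -= bet
        else:
            wealth[a] -= bet
            wealth[b] += bet
    return wealth


if __name__ == "__main__":
    final = yard_sale()
    print("total wealth:", round(sum(final), 6))
    print("gini:", round(gini(final), 3))
    print("share held by richest player:", round(max(final) / sum(final), 3))
```

Every individual bet is fair and total wealth is conserved, so any concentration that shows up comes purely from the stake being tied to the poorer player's holdings.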
Pluralistic: Tiktok’s enshittification (21 Jan 2023) – Pluralistic: Daily links from Cory Doctorow
Pluralistic: EU to Facebook, ’Drop Dead’ (07 Dec 2022) – Pluralistic: Daily links from Cory Doctorow
In Which Long-Time Netizen & Programmer-at-Arms Dave Winer Records a Podcast for Me, Personally
DRMacIver’s Notebook: Three key problems with Von-Neumann Morgenstern Utility Theory
The first part is about physical difficulties with measurement—you can only know the probabilities up to some finite precision. VNM theory handwaves this away by saying that the probabilities are perfectly known, but this doesn’t help you because that just moves the problem to be a computational one, and requires you to be able to solve the halting problem. E.g. choose between \(L_1=p B+(1-p) W\) and \(L_2=q B+(1-q) W\) where \(p=0.0 \ldots\) until machine \(M_1\) halts and 1 after, and \(q\) is the same but for machine \(M_2\).
The second demonstrates that what you get out of the VNM theorem is not a utility function. It is an algorithm that produces a sequence converging to a utility function, and you cannot recreate even the original decision procedure from that sequence without being able to take the limit (which requires running an infinite computation, again giving you the ability to solve the halting problem) near the boundary.
Supervised Training of Conditional Monge Maps—Apple Machine Learning Research
How To Be an Academic Hyper-Producer—Economics from the Top Down
A global analysis of matches and mismatches between human genetic and linguistic histories—PNAS
Desmos—Let’s learn together. graphing calculator online
The Cause of Depression Is Probably Not What You Think—Quanta Magazine
What Monks Can Teach Us About Paying Attention—The New Yorker
Noah Smith, Actually, Japan has changed a lot. Japanese real estate is surprising.
One Useful Thing (And Also Some Other Things) | Ethan Mollick—Substack
The radical idea that people aren’t stupid paired with How to achieve self-control without “self-control”
Colonialism did not cause the Indian famines—History Reclaimed
Erik van Zwet, Shrinkage Trilogy Explainer on modelling the publication process
Mathematics of the impossible: Computational Complexity—Thoughts
Download the Atkinson Hyperlegible Font—Braille Institute What makes it different from traditional typography design is that it focuses on letterform distinction to increase character recognition, ultimately improving readability. We are making it free for anyone to use!
Low-Rank Approximation Toolbox: Nyström Approximation—Ethan Epperly
-ise or -ize? Is -ize American? (1/3) – Jeremy Butterfield Editorial
Iron deficiencies are very bad and you should treat them—Aceso Under Glass
The Australian academic STEMM workplace post-COVID: a picture of disarray
torchgeo—torchgeo 0.3.1 documentation/microsoft/torchgeo: TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
Merve Emre, Has Academia Ruined Literary Criticism?
Matt Clancy, Age and the Nature of Innovation “Are there some kinds of discoveries that are easier to make when young, and some that are easier to make when older”?
Tom Stafford, Microarguments and macrodecisions
Kevin Munger, Why I am (Still) a Conservative (For Now)
Kevin Munger, Facebook is Other People
Randy Au, in Data science has a tool obsession talks about Gear Acquisition Syndrome for data scientists.
Clive Thompson, The Power of Indulging Your Weird, Offbeat Obsessions
omg.lol - A lovable web page and email address, just for you
Donate to a highly effective charity - Effective Altruism Australia. Focussed on poverty and health interventions.
What are the best charities to donate to in 2023? · Giving What We Can
karpathy/nanoGPT: The simplest, fastest repository for training/finetuning medium-sized GPTs.
What is the “forward-forward” algorithm, Geoffrey Hinton’s new AI technique?
Simon Willison, AI assisted learning: Learning Rust with ChatGPT, Copilot and Advent of Code
Fission: Build the future of web apps at the edge incubates several decentralized protocols
danah boyd, What if failure is the plan? I’ve been thinking a lot about failure…
Mastodon—and the pros and cons of moving beyond Big Tech gatekeepers
Michael Nielsen on science online
Great bloggers are rare, weird, and not team players – Kevin Drum
Swayable: RCTs for marketing campaigns via ingenious audience recruiting network
Zoomers Co-Working Community (co-working for accountability)
Normconf Lightning Talks/Normconf: The Normcore Tech Conference, a conference on the stuff that we actually need to do in ML, as opposed to the stuff we would like to pretend is what we do.
Jean Gallier and Jocelyn Quaintance, Algebra, Topology, Differential Calculus, and Optimization Theory for Computer Science and Machine Learning, 2188 pages as of 2022/10/30, and growing.
Terence Eden, You can have user accounts without needing to manage user accounts
Adam Mastroianni, Ludwin-Peery, EJ, Things could be better
Adam Mastroianni, The great myths of political hatred
Big correlations and big interactions ([2105.13445] The piranha problem: Large effects swimming in a small pond)
How to keep cakes moist and cause the greatest tragedies of the 20th century
Distribution testing
GPflow/GeometricKernels: Geometric kernels on manifolds, meshes and graphs
George Ho, How to Improve Your Static Site's Typography (for code formatting)
Invasive Diffusion: How one unwilling illustrator found herself turned into an AI model
Microsoft CSR’s Law Enforcement Request Report is disconcerting transparency
Marc ten Bosch, Let's remove Quaternions from every 3D Engine (An Interactive Introduction to Rotors from Geometric Algebra)
Michele Coscia, Meritocracy vs Topocracy
oxcsml/riemannian-score-sde: Score-based generative models for compact manifolds
Public-facing Censorship Is Safety Theater, Causing Reputational Damage
Ti John’s Publications
Starboard, a shareable in-browser notebook that runs Python (!)
Students Are Using AI to Write Their Papers, Because Of Course They Are
Treehugger Introduces a Modern Pyramid of Energy Conservation
Vast.ai “Rent Cloud GPU Servers for Deep Learning and AI”
Michael Burnam-Fink, What is Scientific about Data Science?
Christian Lawson-Perfect’s Interesting Esoterica is a collection of weird papers in maths.
Erik Hoel, Why do most popular science books suck?
Étienne Fortier-Dubois, The Vibes Are Off
George Ho, Understanding NUTS and HMC
Gordon Brander, Coevolution creates living complexity
Gordon Brander, Thinking together, on egregores, Dunbar numbers and information-processing thresholds in Holocene social evolution, all to motivate
Kate Mannell, Eden T. Smith, Alternative Social Media and the Complexities of a More Participatory Culture: A View From Scuttlebutt
Peter Woit, Symmetry and Physics
Rob J Hyndman, We need more open data in Australia
Vicki Boykis, How I learn machine learning
Oshan Jarow, Markets Underinvest In Vitality
Spirals of Delusion: How AI Distorts Decision-Making and Makes Dictators More Dangerous (not convinced tbh)
Erik Hoel, The gossip trap
The Developer Certificate of Origin is a great alternative to a CLA
I. Risk Management Foundations - Machine Learning for Financial Risk Management with Python [Book]
jkbren/einet: Uncertainty and causal emergence in complex networks
Darren Wilkinson’s Bayesian inference for a logistic regression model 1, 2, 3, 4, 5
Book Review: Public Choice Theory And The Illusion Of Grand Strategy
Stephen Malina — Deriving the front-door criterion with the do-calculus
Census is a tool which links all the weird different data storage systems and CRM stuff
Michael Lewis podcast on illegible experts
Nemanja Rakicevic, NeurIPS Conference: Historical Data Analysis
Yanir Seroussi, The mission matters: Moving to climate tech as a data scientist
Keir Bradwell, #1: In-group Cheems
Samuel Moore, Why open science is primarily a labour issue
Adam Mastroianni, Against All Applications
Have The Effective Altruists And Rationalists Brainwashed Me?
Anthony Lee Zhang, The War for Eyeballs
Digital artists’ post-bubble hopes for NFTs don’t need a blockchain
Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation
I Do Not Think It Means What You Think It Means: Artificial Intelligence, Cognitive Work & Scale
ClearerThinking.org’s courses, e.g.
- Introduction to Decision Academy: The Science of Better Decisions
- Rhetorical Fallacies: Dodging Argument Traps
- Learning from Mistakes: A Systematic Approach
- Probabilistic Fallacies: Gauging the Strength of Evidence
- Explanation Freeze: Interpreting Uncertain Events
- Aspire: A Tool to Help You Improve Your Life
- The Sunk Cost Fallacy: Focusing on the Future
Reddit for AI-generated and manipulated content
PJ Vogt, Selling Drugs to Buy Crypto
Michele Coscia, Pearson Correlations for Networks
The DAIR Institute “The Distributed AI Research Institute is a space for independent, community-rooted AI research, free from Big Tech’s pervasive influence.”
Shakir Mohamed, Machine Learning Trick of the Day (1): Replica Trick
Shakir Mohamed, Machine Learning Trick of the Day (7): Density Ratio Trick
ApplyingML - Papers, Guides, and Interviews with ML practitioners
Ryan Broderick, We were the unpaid janitors of a bloated tech monopoly
fastdownload: the magic behind one of the famous 4 lines of code · fast.ai
Steven Buss, Politics for Software Engineers, Part 1, Part 3
Schneier, When AIs Start Hacking
Multimodal Neurons in Artificial Neural Networks/ Distill version of Multimodal Neurons in Artificial Neural Networks
Francis Bach, Going beyond least-squares – II: Self-concordant analysis for logistic regression
On the Generalization Ability of Online Strongly Convex Programming Algorithms
Illustrations
Some of Tom Gauld’s caution signs
Not quite sure what to do with this incredible and no-longer-appropriate-for-promotions band photo, but wow, what a time capsule.
Homeless links
Bookmarked but where will they ever go?
Dispel your justification-monkey with a “HWA!” - Malcolm Ocean
Roger’s Bacon, Living and Dying with a Mad God
washable & breathable flexiOH cast adapts to the patient’s skin
‘We can continue Pratchett’s efforts’: the gamers keeping Discworld alive
AO3’s 15-year journey from blog post to fanfiction powerhouse - The Verge
today I took a desk lamp whose Halogen light had burned out, whose crappy transformer always made those bulbs sputter, and whose mildly art-deco appearance I’d always liked, and swapped it out to run an LED bulb off USB power. It took about an hour’s work to replace the light with an LED, the switch with a nice heavy clicky one and now the whole thing runs off USB-C instead of wall voltage. It emits no appreciable heat, and if these calculations are to be believed, will run for decades for a few cents per year, assuming I leave it on all the time.
I hadn’t really appreciated how big a deal USB-PD voltage negotiation was until I found out that the little chips that handle that negotiation are about the size of the end of a pencil, that if you include the USB-C port you can replace basically any low-voltage transformer with something smaller than a quarter.
The magic search string, if you want to try this yourself, is “usb-pd trigger module”,
vscode-paste-image/README.md at master · mushanshitiancai/vscode-paste-image
mhoye/awesome-falsehood: 😱 Falsehoods Programmers Believe in
Gary Brecher, The War Nerd: Taiwan — The Thucydides Trapper Who Cried Woof
Evidence of Fraud in an Influential Field Experiment About Dishonesty. Looks bad for Dan Ariely. Damn.
on programming humans (Amir’s work)
Communications' digital initiative and its first digital event
Playable Half Earth Socialism simulator
flatmax/vector-synth: Old 2002 era vector synth code based on XFig
Nick Chater, Would you Stand Up to An Oppressive Regime.
Lambda School’s Job Placement Rate May Be Far Worse Than Advertised
I would like to read the diaries of Usama ibn Munqidh
The latest target of China’s tech regulation blitz: algorithms
State Power and the Power Law, State Power and the Power Law 2
Yuling Yao, The likelihood principle in model check and model evaluation “We are (only) interested in estimating an unknown parameter \(\theta\), and there are two data generating experiments both involving \(\theta\) with observable outcomes \(y_1\) and \(y_2\) and likelihoods \(p_1\left(y_1 \mid \theta\right)\) and \(p_2\left(y_2 \mid \theta\right)\). If the outcome-experiment pair satisfies \(p_1\left(y_1 \mid \theta\right) \propto p_2\left(y_2 \mid \theta\right)\), (viewed as a function of \(\theta\) ) then these two experiments and two observations will provide the same amount of information about \(\theta\).”
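To make the proportionality condition concrete, here is the standard textbook illustration (my choice of example, not taken from Yao's post): 3 successes in 12 Bernoulli trials, observed either under a fixed-sample-size binomial design or under a "stop at the 3rd success" negative binomial design. The function names and the numbers 3 and 12 are assumptions for illustration only.

```python
from math import comb


def binomial_lik(theta, n=12, y=3):
    """p_1(y | theta): y successes in a fixed number n of trials."""
    return comb(n, y) * theta**y * (1 - theta) ** (n - y)


def neg_binomial_lik(theta, r=3, n=12):
    """p_2(n | theta): the r-th success arrives exactly on trial n."""
    return comb(n - 1, r - 1) * theta**r * (1 - theta) ** (n - r)


if __name__ == "__main__":
    for theta in (0.1, 0.25, 0.5, 0.9):
        ratio = binomial_lik(theta) / neg_binomial_lik(theta)
        # The ratio does not depend on theta: the two likelihoods are
        # proportional when viewed as functions of theta.
        print(f"theta={theta:.2f}  ratio={ratio:.6f}")
```

Both likelihoods share the kernel \(\theta^3(1-\theta)^9\) and differ only in the combinatorial constant, so under the likelihood principle the two experiments carry the same information about \(\theta\), even though their sampling distributions differ.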
Liquid Information Flow Control, a confidential computing DSL
Jag Bhalla, Vaccine Greed: Capitalism Without Competition Isn’t Capitalism, It’s Exploitation
Kostas Kiriakakis, A Day At The Park
By analyzing medical text and extracting biomedical entities and relations from the entire history of published medical science, Xyla can facilitate better real-world evidence-based clinical decision support and help make clinical research—such as research into new treatments, including de novo drug design as well as the repurposing of existing drugs—smarter and faster. In so doing, Xyla is fulfilling its mission of organizing the world’s medical knowledge and making it more useful.
My2050 calculator - create your pathway for the UK to be net zero by 2050
Is Pandemic Stress to Blame for the Rise in Traffic Deaths? Nope, apparently it is decreased congestion letting drivers go faster on shit roads.
Marisa Abrajano has a provoking list of research topics. I would like to read the work to see her methodology.
Do normal people need to know or care about “the metaverse”?
Apple acquires song-shifting startup AI Music, here’s what it could mean for users
Black Americans are pessimistic about their position in U.S. society
Smart technologies | Internet Policy Review
Speaking of ‘smart’ technologies we may avoid the mysticism of terms like ‘artificial intelligence’ (AI). To situate ‘smartness’ I nevertheless explore the origins of smart technologies in the research domains of AI and cybernetics. Based in postphenomenological philosophy of technology and embodied cognition rather than media studies and science and technology studies (STS), the article entails a relational and ecological understanding of the constitutive relationship between humans and technologies, requiring us to take seriously their affordances as well as the research domain of computer science. To this end I distinguish three levels of smartness, depending on the extent to which they can respond to their environment without human intervention: logic-based, grounded in machine learning or in multi-agent systems. I discuss these levels of smartness in terms of machine agency to distinguish the nature of their behaviour from both human agency and from technologies considered dumb. Finally, I discuss the political economy of smart technologies in light of the manipulation they enable when those targeted cannot foresee how they are being profiled.
Concurrent programming, with examples
Mention concurrency and you’re bound to get two kinds of unsolicited advice: first that it’s a nightmarish problem which will melt your brain, and second that there’s a magical programming language or niche paradigm which will make all your problems disappear.
We won’t run to either extreme here. Instead we’ll cover the production workhorses for concurrent software – threading and locking – and learn about them through a series of interesting programs. By the end of this article you’ll know the terminology and patterns used by POSIX threads (pthreads).
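The article works in C with pthreads; as a note to self, here is the same threading-and-locking pattern sketched in Python's standard `threading` module instead, where `threading.Lock` plays roughly the role of a pthread mutex. The function name and the thread and iteration counts are hypothetical, chosen only for illustration.

```python
import threading


def run_counter(n_threads=4, n_increments=10_000):
    """Increment a shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(n_increments):
            # The read-modify-write below is the critical section;
            # the lock ensures only one thread performs it at a time.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker before reading the result
    return counter


if __name__ == "__main__":
    print(run_counter())
```

With the lock held around the increment, the final count is always `n_threads * n_increments`; dropping the lock leaves the read-modify-write unprotected, which is the classic lost-update race.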
A study of lights at night suggests dictators lie about economic growth
DIY Collective Embeds Abortion Pill Onto Business Cards, Distributes Them At Hacker Conference
Penny Wyatt, Developer Innovation and the Free Puppy
Elizabeth Van Nostrand, A Quick Look At 20% Time
Chalk is a non-terrible calculator for macOS, incorporating useful things like matrices and bitwise ops