Anshul Kundaje (anshulkundaje@bluesky)

@anshulkundaje

Followers
22,582
Following
2,439
Media
194
Statuses
36,800

Genomics, Machine Learning, Statistics, Big Data and Football (Soccer, GGMU). Post: @anshulkundaje , Threads: anshulkundaje

Stanford, CA
Joined July 2006
Pinned Tweet
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
10 days
Dear big consortia, It is never too late to be brave and use all that visibility you have to make a strong statement that academia will not be held hostage by glam journals & their shiny JIFs 1/
4
13
101
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
6 years
Repeat after me: A reviewer's role is to POLITELY point out flaws in manuscripts and provide HELPFUL suggestions with the goal of helping authors IMPROVE their manuscript. The goal is not to find some random reason to reject the paper. And definitely not a license to be an ahole.
28
286
1K
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
11 months
All postdocs in my lab are paid in the range of 85-95K. One of the reasons I can't go beyond is (reverse) equity concerns from HR. I would ideally pay them 105-115K. I think that's how much I could afford based on funding I can pull in & reducing lab size to a functional minimum.
@toomanyspectra
Seven Machina Rasmussen
11 months
oh you’re hiring a grad student/postdoc? don’t be shy babe show us the salary
17
104
963
36
84
991
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
12 days
Oh nice! @Nature publishes another amazing advertisement for an amazing product masquerading as a scientific publication with no code, no model & a webserver that allows 10 queries at a time from a billion dollar company. But it's "gold" open access folks! Rejoice!
15
91
808
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 months
To all budding compbio & ML folks interested in bio: Don't just run behind the latest ML model hype train. The greatest long run impact will come by really assimilating prior bio/compbio literature with the goal of really understanding strategies for how to model biology. 1/
8
129
784
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 years
Postdocs deserve far more than we're usually paying them (especially those who end up at the big shot schools in the big shot towns and cities). Let's at least try to make an effort to do better before we lose them all to industry.
24
33
634
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
1 year
It's finally happened! #Ascension (Figure 1: what @midjourney_ai claims one is supposed to feel like when promoted with tenure😆) 1/
Tweet media one
147
16
566
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
4 years
After 4 months of effort (2 months of very disciplined effort - 2 hours every night), I've finished my second manual scan of the entire GRCh38 human genome assembly + 900 DNase tracks (processed 3 different ways) + Input DNA tracks + repeat masker. For what, you ask?
20
77
564
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
1 month
I've noticed this over the years. We've had amazing high schoolers (HSs) join us every summer. But over time, I've noticed a change in priorities. 10 years ago most HS applicants were just super gungho about learning & doing fun research. 1/
@shortstein
Thomas Steinke
1 month
2019: Need NeurIPS papers to get postdoc/faculty position. 2024: Need NeurIPS papers to get PhD student position. 2029: Need NeurIPS papers to get into undergrad program. 2034: Admission to NeurIPS onsite daycare requires accepted paper.
16
179
1K
5
34
549
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
2021 was an "interesting" financial year for the lab. We started off with a nice surplus (have been lucky to have healthy finances from 2014-2021). Then 2 things changed 1/
20
67
509
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
Request to biologists collaborating with computational groups: please don't encourage our trainees to sacrifice computational rigor because you want to "get your paper out". We don't encourage your trainees not to do rigorous control expts. so we can get faster data access. 1/
5
76
505
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 years
Our paper out in @NatureGenet uses CRISPR essentiality screen data to map gene function! Another beautiful collab w/ Mike Bassik, led by the outstanding @michaelwainberg , @RoarkeKamber , @_bakshay with invaluable contribs from coauthors. 1/
Tweet media one
@NatureGenet
Nature Genetics
3 years
🔥HOT OFF THE PRESS @NatureGenet by @anshulkundaje Michael C. Bassik and colleagues: 🧬A genome-wide atlas of co-essential modules assigns function to uncharacterized genes |
5
108
332
7
145
496
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 years
ML models in genomics are now routinely capable of generating 100-1000s of robust hypotheses - many/most of which are novel. It takes like months to validate one or a few of these & requires diverse expertise to make this happen 1/
10
73
481
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
7 years
Sad, ridiculous, annoying to see PIs arguing 60K is too high a postdoc salary. No wonder we can't retain top talent in academia.
22
105
450
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
And now we are 4! Our baby arrived 4 weeks early! Was chaos, but baby and mum are doing great. Very excited 🙂 Wish you all the infinite joy we're feeling right now! #midjourney 's imagination of baby and mum based on my description. Remarkable likeness.
Tweet media one
75
1
438
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 years
Why do we have so few academic labs explicitly focused on scientific visualization and interfaces in genomics/biology? Or maybe I just know too few of these. Please enlighten.
40
40
390
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 months
Great application of DNA language models to prokaryotic genomes. Excellent impactful application use cases of the model. Congrats to all the authors! But a quick answer to the first question. No - DNA is not all you need. 1/
@pdhsu
Patrick Hsu
3 months
Is DNA all you need? In new work, we report Evo, a genomic foundation model that learns across the fundamental languages of biology: DNA, RNA, and proteins. Evo is capable of both prediction tasks and generative design, from molecular to whole genome scale.
39
442
2K
4
61
399
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
5 months
Nice. About time. I paid PDs 70K in 2017. I now pay 90-95K. Aiming to deprecate the PD position completely in the next 4 years - will aim to hire all PDs & longer term staff scientists as staff with 115-120K salaries. (Still crap for Bay Area) 1/
@k_langin
Katie Langin
5 months
An NIH advisory group announced today that they're recommending the agency boost its minimum #postdoc salary to $70,000, up from the current $56,484. Many postdocs feel "underpaid and overworked," a co-chair said. My latest for @ScienceInsider :
35
547
3K
11
28
351
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
A quick word of advice to all budding scientists. Learning how to do science and how to communicate science (writing, visuals and presentation) are often the hardest skills with steep learning curves. So best to start sharpening these skills early. 1/
6
43
341
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
I know this incident makes me look like a fool. I don't care. Am revealing this as a word of warning to new PIs. The academic system has no guardrails. Little structured knowledge passed on. I learned everything by trial and error. Please don't make same mistakes I made. 11/11
28
9
337
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
5 years
I'm surprised, honored and humbled. Thank you so much! I look forward to the meeting in April.
@humangenomeorg
Human Genome Org
5 years
HUGO would like to congratulate Dr. Anshul Kundaje @anshulkundaje for being awarded one of the two 2019 HUGO #ChenAward of Excellence! Don’t miss his plenary lecture on April 26th at #HGM2019 .
Tweet media one
3
7
58
47
20
331
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
1 month
More recently, it's been a lot more about whether they can publish something, ideally first author, win top science competitions etc. It's not really their fault. But they're being trained to bean count when they're still in their diapers. Very sad to see. 2/
2
6
320
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
6 months
This is a nice summary of next-generation deep learning architectures for sequence modeling & recent work that has attempted to use these for modeling DNA sequence. But, IMO, it is premature to claim that these are the future of genomics. 1/
@BoWang87
Bo Wang
6 months
Interested in LLMs for genomic research but don't know where to start? looking for a review/survey to get started in this field? 👇👇😀 I am very excited to share that our review paper titled "To Transformers and Beyond: Large Language Models for the Genome" is now available as
Tweet media one
Tweet media two
Tweet media three
15
290
1K
4
66
319
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
1 month
I ran away from the academic system in my home country to the US decades ago precisely cuz the bean counting used to make me seriously sick. But here we are again! The premature over achiever pandemic has very much reached these shores. 4/4
6
5
303
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
8 months
Excited that we will be co-leading the DAC with @ZhipingWeng (fearless leader) @MooreJillE & @thabangh for the new Multiomics for health and disease consortium.
23
35
299
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
5 months
Long but slow thread incoming on "DNA language models". I have a lot to say but not much time. Can't post a giant thread all at once. Will post as I get time. Goal is to trigger critical discussion & to focus efforts towards useful endpoints instead of meaningless benchmarks. 1/
8
38
288
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
1 year
We published a paper "Integrative single-cell analysis of cardiogenesis identifies developmental trajectories and non-coding mutations in congenital heart disease" 1/ DOI:
8
63
282
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
I think we're grossly underestimating the importance of systematically profiling genome wide TF binding in large numbers of cell contexts. With ENCODE ending, are there other efforts doing this on a massive scale? If TF enthusiasts are interested in teaming up, I'm in.
18
33
276
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
4 years
In the last 1 hour, I've come across 20 must-read papers and preprints (from my perspective) in ML and Bio. WTF people. Slow down.
6
11
250
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
10 months
This is unfortunately going to be one of those classics talked about in every lecture on pitfalls of ML for bio. Ufff! Data processing is absolutely key. And any time one gets insanely high accuracy from an ML model in real life applications, it's usually too good to be true.
@StevenSalzberg1
Steven Salzberg 💙💛
10 months
Major, fatal errors found in the data and methods of a 2020 paper in @Nature , including millions of reads mis-identified as bacteria. The "cancer microbiome" in this study was simply not there. @abrahamgihawi @elapertea @YuchenGe1 @JenniferLu717
40
504
2K
5
39
243
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
5 years
Too many awesome new datasets, biological findings, ML methods every second of every day. And they are spawning too many new ideas. I luv science & I'm going crazy (in a good way) that I can't keep up with all the coolness and just have one brain and two hands and no time. Fk!!!!
6
28
240
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
4 years
Our BPNet paper been thru 2 review rounds (10 months) now rejected cuz two reviewers appear to be in eternal love with PWMs😂. But wait. U can still read/use/cite it ! Thank u @biorxivpreprint for existing!
23
30
238
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
9 months
I have often wondered why there aren't more catering services for scientific meetings in the US that specialize in Indian or Asian fast food / appetizers. It would be just as cheap (probably cheaper) & be far more tasty than the current status quo.
26
2
230
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
5 years
We developed high performance multimodal ResNet architectures to predict genome wide chromatin accessibility across cell types using cis sequence and gene expression. 1/
@generegulation
Gene Regulation
5 years
Integrating regulatory DNA sequence and gene expression to predict genome-wide chromatin accessibility across cellular contexts cis and trans regulation of chromatin dynamics across 123 diverse cellular contexts. #ChromDragoNN code:
Tweet media one
0
30
87
2
90
223
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
1 year
New Years Resolutions 1. Sleep more ... A lot more 2. Don't work over the weekends, no matter what 3. Reduce time spent in meetings by half 4. Double time spent writing 5. Keep travel to bare minimum
7
6
218
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
6 years
Machine learning and genomics/biology both progressing at a breathless pace. Crazy times trying to keep up with the literature and code and contribute to it. Feels like it just makes sense for ML+genomics academic labs to coordinate and collaborate a lot more with each other.
9
57
212
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
4 years
Incredibly proud of DR. @avshrikumar for successfully defending her PhD thesis covering all her incredibly innovative research on interpretation of ML models in genomics and beyond. 🎉🎉🎉🎉🎉 She's amongst the best scientists I have ever worked with.
10
7
211
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
This is an excellent paper for multiple reasons. Firstly, very important problem and a very significant contribution scientifically. But I'm going to focus on another aspect. IMHO, it's an excellent case study showcasing best practices for papers using ML for genomics. 1/
@vagar112
Vikram Agarwal (@vagar.bsky.social)
2 years
Super excited to finally release my work with @drklly @calico ! mRNA degradation rate is a fundamental property of its metabolism. It has been notoriously difficult to predict mRNA half-life from its sequence, and characterize factors regulating it. 1/4🧵
6
115
540
1
29
211
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
I think this is just the right time for the brilliant forward looking folks behind @biorxivpreprint to set up something equivalent to Twitter for the academic community focused on science and preprint comm & discussions. (Eg. Maybe building on Mastodon). 1/
12
24
209
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 months
So I guess the conclusion is that we should absolutely never use 2D visualization methods to visualize high D data because they are always misleading. All the alternatives r also misleading so we shud not use those as well. Best not to visualize data. Just stick with p-values.😜
24
17
210
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
5 years
For many international students who are not from highly reputable institutions (like me), MS programs are key entry point into the 'system'. Intl students don't have equal access to research opportunities. Look inferior on paper. But they can kick ass given the chance (MS).
@OmnesResNetwork
Dread Pirate Jordan
5 years
I'm starting to think that master's programs are simply scams that prey on wealthy students trying to pad their resumes.
5
2
26
12
26
208
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 years
Pandemic time course: Very poor productivity the first 6 months (particularly during school closure) -> managed to adapt to ~40% usual level by working at odd hours which was very exhausting -> now pandemic is "ending" and I am feeling super-duper burned out. I am a dumbass.
16
5
209
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
This is true for grad students as well. Eg. It takes a lot of work for pure CS or pure bio undergrads to transition into serious computational biologists. It's important to be prepared for a steep learning curve & have patience. 1/
@ItaiYanai
Itai Yanai 💔
2 years
Postdocs: You will pay a price for switching fields, but you can more than make up for that with the ability to recognize a hidden connection to spark a discovery (cancer -> development; immunology -> evolution; genetics -> physics..).
21
65
800
6
29
205
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
(1) The cloud is really designed to suck finances from customers. Very poor cost management tooling especially for academic labs who live on tight budgets. I used to be quite bullish about cloud. Learned my lesson the hard way. 9/
6
37
204
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
7 months
Greatest thing to have happened for science in the last decade.
@biorxivpreprint
bioRxiv
7 months
bioRxiv is almost 10 years old and medRxiv just turned 4! 🎉 Tell us how we’ve done so far & how to keep improving by answering the bioRxiv & medRxiv survey! Deadline: December 10th, 11:59 PM EST Length: 15-20 mins 🔗 Thank you! #biorxiv #medrxiv #preprints
Tweet media one
5
86
177
5
34
200
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 years
We just adopted this guy. Gonna be so much fun.
Tweet media one
8
3
201
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
6 years
What is it with people who have never used or are not familiar with neural networks constantly cribbing about how 'black box' they are. Go read some literature on the topic. Seriously.
14
40
198
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
1 month
I mean these kids will end up being more successful in the system, just like the ugrads who have stacks of NeurIPS papers before they get into top grad programs & schools. But the bean counting doesn't go away. It just gets more ingrained in most cases. 3/
2
4
194
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
7 months
Check out our latest work led by @suragnair & @immoameen using neural nets to dissect the interplay of TF stoichiometry, motif affinity & syntax in fibroblast reprogramming to iPSCs. Several fascinating insights from the models. Tweetorial coming soon 1/
@TF_binding_bot
TF binding papers 📚 NucPosDB database
7 months
Transcription factor stoichiometry, motif affinity and syntax regulate single-cell chromatin dynamics during fibroblast reprogramming to pluripotency
0
3
14
6
58
189
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
6 years
ML/AI folks: Bio is one of the most exciting domains for ML: unique challenges, hard important problems, huge potential impact (it's a marathon though .. not a sprint). The Stats community embraced bio as a core domain decades ago. Time for u to take the dive as well.
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
6 years
@NipsConference shockingly decided to axe the ML for Compbio (MLCB) workshop (the only one for bio applications of ML) this year. MLCB been active for over a decade. Record submissions last year. Kinda insane decision.
32
38
80
3
53
188
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
4 years
Great news today. Our brilliant colleague and friend Mike Bassik just got promoted with Tenure at Stanford Genetics. @StanfordMed And a postdoc in my lab just had a baby. First lab baby (excluding my own). 🎉🎉🎉
4
6
187
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
This video is being mocked by several academics. If an expert said such things MAYBE they might deserve some jostling. I fail to understand the benefit in mocking a rich guy excited about funding spatial proteomics (in whatever way he understood it)!
@cziscience
CZI Science
2 years
We’re excited to announce that the @ChanZuckerberg Institute for Advanced Biomedical Imaging’s initial focus will be to identify and map the position of every protein in a cell. Learn more from our co-founder’s exclusive @Medscape interview:
39
70
294
10
9
184
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 years
Covid19 is ravaging India. Very sad to see. We personally know of so many people who have gotten sick and so many who have died. There was overconfidence in unfounded claims of innate immunity & other voodoo. Facing horrible consequences. Response seems extremely inadequate.
3
31
184
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
10 months
Grad students: Ruuuuun. Ruuuuun away from @arjunrajlab . He's the Voldemort of academic PIs cuz he prefers using his & your time more efficiently. He actually meets random students on Zoom to provide all kinds of career advice. WTFFFFFF! Ruuuuun!
@arjunrajlab
Arjun Raj
10 months
Not sure who is giving this guidance, but PhD students: we don't need to meet to discuss whether or not I should be on your thesis committee. Just send a description of what you do and why I might be helpful. I can decide based on that.
142
11
536
4
3
183
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
5 years
I'm extremely super-excited about this work. Tweetorial coming soon!
@Avsecz
Žiga Avsec
5 years
Our pre-print "Deep learning at base-resolution reveals motif syntax of the cis-regulatory code" is out! Try training and interpreting BPNet on your own genomic tracks . Thx @anshulkundaje @ZeitlingerLab @MelanieWeilert @avshrikumar
Tweet media one
Tweet media two
7
119
239
2
46
184
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
5 years
The interdisciplinary curse: When u submit a paper w/ novel modeling & novel bio findings to a 'high profile journal' & get reviewers who don't understand the novelty and significance of the models (rev1) or the bio (rev3) or both (rev2) 😫😩😭😡
12
14
177
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
1 year
Question: So let's say I synthesize a piece of DNA, not found in the human genome, dump it next to Myc and it affects the growth rate of a cancer cell line, is this piece of DNA 'functional'?
43
21
170
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
9 months
A quick side note. Folks have given @ENCODE_NIH & consortia a hard time based on claims that the entire effort was a waste. This very cool paper is yet another reminder that ENCODE data has literally powered the deep learning revolution in genomics. 1/
@drklly
David Kelley
9 months
Check our new paper “Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation”.
11
141
474
5
26
174
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
8 months
Two amazing postdocs in my lab have been stuck for months renewing their visas (one stuck in India and the other stuck in China). They will finally be returning soon. So much productivity and time lost. Again totally unacceptable! Morons run the US visa & immigration system.
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
8 months
Insanity that immigrants that are likely to be strong contributors to US economy & society continue to be treated like trash. Distinctly remember the extremely demeaning (felt like I was livestock) interview with "visa officers" when applying for a student visa years ago.
3
8
102
13
14
172
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
5 years
I'm starting to lose track of all the highly nebulous 3D genome terms TADs, microTADs, subTADs, domains, loops, contacts, CRDs. I love reading about this stuff but it's literally impossible to know exactly how these terms map to each other in different papers. Please standardize.
14
19
171
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
5 years
One trick to get around imposter syndrome, is to just be comfortable being an imposter. Whether you are truly an imposter (like me) or not (like all my supersmart colleagues), it will not matter :)
7
4
171
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 months
Excited to be funded by @BiswasFamilyFdn to develop a chatbot to facilitate genetic diagnosis of cardiovascular diseases. Their Transformative Computational Biology Grant Program aims to advance computational approaches in research & clinical practice. 1/
Tweet media one
9
13
168
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
4 years
Hi All. The new manually annotated GRCh38 blacklist is now at the ENCODE portal (README w/ brief description at bottom of page) All feedback welcome. Especially on any regions that might have been missed.
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
4 years
THE BLACKLIST! New manually annotated version that reconciles 3 auto-generated blacklists. Releasing soon at an @EncodeDCC portal near you!
4
10
98
3
47
163
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
11 months
Either way, academic salaries are garbage. There is no denying that.
1
4
162
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
5 years
If you work on any kind of predictive models in genomics, please read this paper very carefully. Extremely important to consider how you slice and design your training, hyperparam tuning and test dataset. And very imp. to compare to the per-locus average activity strong baseline.
@jmschreiber91
Jacob Schreiber
5 years
Happy to share new work on a pitfall you can fall into if you train ML models to predict across cell types. TL;DR, always compare your predictions to the per-locus average activity, it's a hard baseline to beat! @uwescience @uwgenome @uwcse @EncodeDCC
2
70
167
1
68
155
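The per-locus average baseline mentioned in the two posts above can be sketched in a few lines. This is only a toy illustration, not the method from the referenced paper: the data, the `model_prediction` values, and the choice of mean squared error as the metric are all assumptions made for the example. The idea is to predict activity in a held-out cell type as the mean activity at each locus across the training cell types, then check whether a model actually beats that.

```python
from statistics import mean


def per_locus_average(train_signal):
    """Average activity at each locus across the training cell types.

    train_signal: one row per training cell type, each row a list of
    per-locus activity values (all rows the same length).
    """
    return [mean(column) for column in zip(*train_signal)]


def mse(pred, truth):
    """Mean squared error between two equal-length lists."""
    return mean((p - t) ** 2 for p, t in zip(pred, truth))


# Hypothetical toy data: 3 training cell types x 4 loci.
train_signal = [
    [5.0, 0.0, 2.0, 9.0],
    [6.0, 1.0, 2.0, 8.0],
    [4.0, 2.0, 2.0, 10.0],
]
held_out_truth = [5.0, 1.0, 3.0, 9.0]    # measured activity in a held-out cell type
model_prediction = [4.0, 2.0, 4.0, 7.0]  # output of some hypothetical model

baseline = per_locus_average(train_signal)
print("baseline MSE:", mse(baseline, held_out_truth))
print("model MSE:   ", mse(model_prediction, held_out_truth))
```

In this made-up case the average-activity baseline has lower error than the hypothetical model, which is the kind of result the comparison is meant to surface.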
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
Decoding the genome
Tweet media one
6
7
158
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
1 month
"caQTLs and haQTLs capture regulatory variations not associated with eQTLs and explain ∼49% of the functionally annotated GWAS loci" Been clear for a while that accessibility & histone marks provide more info for explaining & fine mapping GWAS loci than expression. 1/
@biorxivpreprint
bioRxiv
1 month
Multi-omic QTL mapping in early developmental tissues reveals phenotypic and temporal complexity of regulatory variants underlying GWAS loci #bioRxiv
0
6
40
4
34
153
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
6 months
Stopped reading after the first line. Utterly ridiculous hubris to even utter these words "we have a cure for cancer". Marketing pitches from these CEO types make me really want to puke.
@emigal
Emi Gal
6 months
I believe we have a cure for cancer. Early detection. Here’s how @ezrainc plans to detect cancer early for everyone in the world (the secret Ezra master plan): Step 1 (done): Launch a 60-minute full-body MRI that screens for cancer and 500 other conditions in up to 14
Tweet media one
102
33
417
11
4
157
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 months
Another quick note. One of the (ex-) students in the lab quickly tested how promoter GC content (% G/Cs) correlates with expression in the benchmark dataset. GC content = 0.58. zero-shot LLM = 0.41. Well motivated baselines are important. I will keep harping on this forever! 1/
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 months
While incredibly impressive, let's just take the zero-shot gene expression prediction performance of the model (Spearman 𝑟 = 0.41). This is extremely low for a prokaryotic genome 2/
2
1
41
6
19
154
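A GC-content baseline of the kind described above is cheap to compute. The sketch below is an assumption-laden toy, not the lab's actual analysis: the promoter sequences and expression values are invented, and the Spearman correlation is implemented by hand (Pearson correlation on average ranks) so that the example needs only the standard library. It does not guard against constant inputs, which would give a zero denominator.

```python
from statistics import mean


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation as Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: promoter sequences and matched expression values.
promoters = ["ATATATAT", "ATGCATAT", "GCGCATAT", "GCGCGCAT", "GCGCGCGC"]
expression = [1.0, 2.5, 2.0, 8.0, 9.5]

gc = [gc_content(p) for p in promoters]
baseline_rho = spearman(gc, expression)
print(f"GC-content baseline Spearman rho = {baseline_rho:.2f}")
```

On a real benchmark, the same number would be reported next to the model's zero-shot correlation, so that readers can judge how much the model adds over sequence composition alone.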
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
COVID finally found us after 3 years. We've been down and out since last week. Not sure what variant hit us but it wasn't "mild". Was particularly worried about the wife (and our baby that is due in a month). But it looks like they pulled through fine. On the path to recovery 1/
21
0
152
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
11 months
And here we go. Closed source foundation models in biology starting to pop up. Really important for academia to mobilize & develop open source (code & parameters) competitors ideally through industry partnerships. We really should avoid closed source monopolies.
@thesteinegger
Martin Steinegger 🇺🇦
11 months
100B parameter protein language model trained on @uniprot and the #ColabFoldDB using 768 NVIDIA A100 GPUs for several months. The LM shows significant improvements in most prediction categories. Note: the model is not open-source; only the training data is currently available.
Tweet media one
1
27
155
8
21
150
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 years
Great collaborative effort by the four co-first authors under the leadership of @WJGreenleaf @PascaStanford . Here I'll just focus on the insights from our neural net models trained on the gorgeous scATAC-seq data for prioritizing non-coding autism mutations. 1/
@Sergiu_P_Pasca
Sergiu P. Pasca
3 years
Our work on human cortical development is in @cell today. Effort to understand the logic of lineage progression & map #autism mutations Collab work with @WJGreenleaf Led by👏: Alex @atrev_bio & Fabian ( @EpigenomeI ) & Jimena ( @jimena_andersen ) & Laksshman
Tweet media one
6
152
675
1
30
150
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
27 days
Cool work. But what is it with these BULLSHIT headlines. Seriously. "Can AI rewrite our genome". No it cannot. And it's one thing when the "press" do it. It's another thing when the scientists do it themselves. WHY?? Isn't the actual science cool enough?
@thisismadani
Ali Madani
27 days
Can AI rewrite our human genome? ⌨️🧬 Today, we announce the successful editing of DNA in human cells with gene editors fully designed with AI. Not only that, we've decided to freely release the molecules under the @ProfluentBio OpenCRISPR initiative. Lots to unpack👇
284
997
4K
11
6
148
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
7 months
It is becoming amply clear the Gaza hospital explosion had absolutely nothing to do with IDF & entirely due to dumbfk terrorists shooting failed rockets into their own backyard. Maybe we shud all wait before reinforcing our priors, cuz misinfo can result in further escalation. 1/
23
19
144
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 months
Can someone point to some single cell papers that don't use 2D projection visualizations? I'm very curious to see alternative visualizations. I expect there should be a ton considering the immense outpouring of support for not using 2D projections.
19
15
146
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 years
The biggest lesson we still haven't learned from the pandemic is that the world is extremely connected and the only way to save ourselves is to save everyone. A truly united response. If there is a deadlier pandemic in the future, we will be surely doomed.
2
22
145
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
7 months
I'm not going to be tweeting much science over this week. I am in no mood to discuss genomes when the world is falling apart and people can't distinguish their head from their arse. So please feel free to mute until Sunday if you don't want to hear me vent my frustration.
7
6
143
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 years
Over 1000 participants registered for #MLCB2020 (Nov 23-24) ! The zoom webinar has a 1000 participant limit. If you haven't registered yet or are on the waitlist, don't fret. We will provide a YouTube live stream link on the MLCB site 30 min before start.
4
10
142
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
We've tried SO many fancy multi-task optimizers published in top ML conf. venues for "simple" regulatory genomics tasks and so far none of them can beat the simple: training naive multi-task model -> fine tune on each task.
@y0b1byte
yobibyte
2 years
🚨 Paper Alert The multitask learning 📔📜 claims that naive grad summation is weaker than ad-hoc optimizers due to task interference/grad conflicts. We show that you can get the same results by coupling it with plain old techniques, e.g. regularisation 🧵
7
54
285
1
29
138
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
4 years
Check out our latest work, led by the brilliant Alex Tseng w/ @avshrikumar on using Fourier based attribution priors to stabilize & improve feature attribution scores derived from neural network models in regulatory genomics. 1/
5
26
139
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
Had a really tough 2 months on multiple fronts that have really caused a serious dent in my productivity. Switched my schedule temporarily (more late nights) to try to shake things up. Thankfully, it looks like it's working. Finally getting some momentum back! 1/
11
1
139
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
Very excited that "Dynseq" tracks will soon be supported at the UCSC @GenomeBrowser as well. Below is a preview. Dynseq tracks are just bigwig tracks but they get visualized as dynamic sequence where the heights of bases are proportional to the signal at each position. 1/
4
23
139
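The Dynseq idea in the post above (base heights proportional to the signal at each position) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the UCSC Genome Browser's implementation: the function name `dynseq_heights`, the `max_height` parameter, and the example values are all hypothetical, and the signal is assumed to already be available as one number per base (for example, values read from a bigwig track).

```python
def dynseq_heights(sequence, signal, max_height=1.0):
    """Pair each base with a letter height proportional to its signal.

    Hypothetical helper: scales the per-base signal so that the largest
    absolute value maps to max_height. Negative values yield negative
    heights (letters that would be drawn below the axis).
    """
    if len(sequence) != len(signal):
        raise ValueError("sequence and signal must have the same length")
    # Largest absolute signal value; 0.0 for an empty track.
    peak = max((abs(v) for v in signal), default=0.0)
    # Avoid dividing by zero when the track is flat at zero.
    scale = max_height / peak if peak > 0 else 0.0
    return [(base, value * scale) for base, value in zip(sequence, signal)]


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    for base, height in dynseq_heights("ACGT", [0.0, 1.0, 2.0, 4.0]):
        print(base, height)
```

Scaling to the largest absolute value is just one possible design choice; a real track display might use a fixed or user-chosen range instead.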
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
Made Bhel puri for dinner! Can smell the Indian Ocean whenever I have Bhel Puri.
6
3
138
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 years
Congratulations to all the IGVF Consortium awardees! Fortunate to also get our IGVF predictive modeling grant funded with an amazing group of collaborators @alexisjbattle @jkpritch @sbmontgom @LivnatJerby & @jengreitz 1/
7
11
137
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
10 months
When I was in school & college, I loved biology at least partly because of all of the beautiful figures of cells & organelles in some of the fabulous books. But this is just next-level. I don't know how any young person can resist diving deep into the beauty & mystery of life.
@SmartBiology3D
Smart Biology
10 months
By allowing students to truly understand and appreciate the incredible molecular machines inside cells and the lifeforms they manifest, we are bringing biology to life and changing how the world understands biology. This is why we do what we do. Visit
38
1K
6K
3
16
138
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
10 months
This kind of situation has always been my worst nightmare. I've often held back papers from my lab by months when I have suspected issues & until I'm convinced that I've done everything in my power to vet it. 1/
@slavov_n
Prof. Nikolai Slavov
10 months
The investigation concluded that Tessier-Lavigne created a culture conducive to data manipulation. That's awful. I would go further: A PI who cannot identify inappropriate data manipulation on a project they 'lead' is PI in name only. He is not the principal investigator.
14
10
86
3
23
136
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
4 months
The worst thing about this new Twitter/X is all the rando nude accounts that keep following, liking & retweeting posts. I block at least 10 accounts every day that have no content other than semi-nude pics & redirection to some voyeur websites. What is this shit?!? @elonmusk
24
7
133
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
New lab pic! Look at all these fabulous genome explorers! Missing a few. But was great to have a nice outdoor picnic and hike.
2
1
137
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 years
Two exciting papers on long-range sequence models for gene regulation over the last week or so 1. Graph attention networks 2. Transformer models 1/
@vagar112
Vikram Agarwal (@vagar.bsky.social)
3 years
Excited to announce our new behemoth of a model: the Enformer! Unified access to gene expression, TF binding, & chrom mark prediction; enhancer-promoter inference; & nc-variant effect prediction - all from a DNA sequence! Great collab with @Avsecz @drklly
6
95
293
2
28
135
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
13 days
This is a cool result. I think it's another nail in the coffin for the claims that additive models are sufficient to model regulatory DNA. It is amply clear that cooperative interactions are extensive & in this case even important for understanding variant effects.
@LaylaSiraj
Layla Siraj, Ph.D.
13 days
@julirsch @jraylab @jeffvierstra @drklly But what about variants working together (regulatory epistasis)? We tested 2,522 pairs of fine-mapped variants from the same CRE, identifying 180 pairs with non-additive effects that were driven in part by proximity. 9/n
2
3
32
4
22
134
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
Excellent work. A great decomposable approach for first order PWM motif discovery using neural additive models. A fast, powerful, scalable, predictive replacement for classic motif discovery tools (e.g. MEME/HOMER etc). 1/
@manusaraswat10
Manu Saraswat
2 years
Want to apply #deeplearning in your #regulatory #genomics analyses but overwhelmed by suite of interpretation tools? Check out ExplaiNN: interpretable and transparent neural networks for genomics with @NovakovskyG @ofornes @sara_mostafavi @WyWyWa 🧵👇
4
64
243
2
23
134
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
1 year
New lab pic (5 members missing). Love this group of nice, helpful, caring, brilliant scientists.
2
3
134
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
3 years
Thank you to the US govt. for extending a helping hand to India. This is the only way forward. All people and nations are one in this crisis. Let's make sure we continue this trend for all the nations that need us.
@VincentRK
Vincent Rajkumar
3 years
BREAKING NEWS: US will help India immediately with vaccine raw materials, emergency supplies, testing kits, and more. Thanks @ASlavitt for sharing. 👇👇
12
113
466
3
14
132
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
4 years
Happy Thanksgiving to everyone. This year I'm particularly thankful for my lovely family + incredible lab members + brilliant, supernice students/PDs/colleagues @Stanford + generous collaborators + a largely supportive & super nice community of scientists
1
3
131
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
5 am on a Saturday blissfully enjoying feeding your newborn, u get a congratulatory reminder from LinkedIn about your work anniversary ... 1/
5
7
131
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
6 months
I'm obviously very bullish about the incredible potential of ML in biology but I have a feeling there will be plenty of great low-hanging fruit, followed by a humbling & tough road ahead with many dead ends & so-close-yet-so-far situations. Exciting nonetheless.
@GeneInvesting
Gene Investing w/Anthony 🧬
6 months
Seems like a consensus is forming amongst many smart people! This is Jensen Huang, CEO of $NVDA Basically WATCH OUT for the impending biotech BOOM 💥
133
835
4K
3
4
129
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
11 months
This is a really beautiful paper & highlights an innovative strategy: train & query a powerful unconstrained deep learning model to learn syntax rules -> inspires design of a simpler architecture constrained DNN that approximates the signal well + efficient to interpret 1/
@KDudnyk47866
Kseniia Dudnyk
11 months
Excited to share our new work about the sequence basis of transcription initiation in human. We showed that simple rules can explain most human promoters, offering fresh insights into transcription initiation and gene regulation
5
85
349
1
19
130
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
1 year
This recent talk features ChromBPNet, our upcoming flagship predictive sequence model of base-resolution chromatin accessibility profiles (ATAC-seq, DNase-seq, scATAC-seq cluster pseudobulks) with built-in de-novo enzyme bias correction. 1/
@EalesJames
𝕁ames 𝔼ales
1 year
Can’t recommend this talk by @anshulkundaje enough. Also thanks to @broadinstitute for sharing on YouTube
1
10
42
1
23
130
@anshulkundaje
Anshul Kundaje (anshulkundaje@bluesky)
2 years
Just submitted the last recommendation letter of the season (this year was my record I think with over 100 rec letter submissions!). So much brilliance and talent from high schoolers to tenure trackers. The future of science in academia and industry is very bright.
4
2
126