The story is the thing itself. Greater than the sum of its parts. When a story works, theme arises interstitially like harmony, and you are convinced because of the story’s spiritual truth and internal logic.
My friend, you lost.
More from All
How can we use language supervision to learn better visual representations for robotics?
Introducing Voltron: Language-Driven Representation Learning for Robotics!
Paper: https://t.co/gIsRPtSjKz
Models: https://t.co/NOB3cpATYG
Evaluation: https://t.co/aOzQu95J8z
🧵👇(1 / 12)
Videos of humans performing everyday tasks (Something-Something-v2, Ego4D) offer a rich and diverse resource for learning representations for robotic manipulation.
Yet, an underused part of these datasets is the rich, natural language annotations accompanying each video. (2/12)
The Voltron framework offers a simple way to use language supervision to shape representation learning, building off of prior work in representations for robotics like MVP (https://t.co/Pb0mk9hb4i) and R3M (https://t.co/o2Fkc3fP0e).
The secret is *balance* (3/12)
Starting with a masked autoencoder over frames from these video clips, make a choice:
1) Condition on language and improve our ability to reconstruct the scene.
2) Generate language given the visual representation and improve our ability to describe what's happening. (4/12)
By trading off *conditioning* and *generation*, we show that we can 1) learn better representations than prior methods, and 2) explicitly shape the balance of low- and high-level features captured.
Why is the ability to shape this balance important? (5/12)
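A minimal sketch of what trading off the two objectives could look like in a training loop. This is not the Voltron implementation: the `alpha` parameter, the per-step sampling scheme, and the function names are all assumptions made for illustration.

```python
import random


def choose_objective(alpha: float, rng: random.Random) -> str:
    """Pick the objective for one training step (hypothetical helper).

    With probability `alpha`, condition on language and reconstruct the
    masked frames; otherwise, generate language from the visual features.
    """
    return "condition" if rng.random() < alpha else "generate"


def objective_counts(alpha: float, steps: int, seed: int = 0) -> dict:
    """Count how often each objective is picked over `steps` steps."""
    rng = random.Random(seed)
    counts = {"condition": 0, "generate": 0}
    for _ in range(steps):
        counts[choose_objective(alpha, rng)] += 1
    return counts
```

Under these assumptions, `alpha = 1.0` always conditions on language, `alpha = 0.0` always generates language, and values in between split the training steps between the two.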
You May Also Like
MDZS is laden with Buddhist references. As a South Asian person and history buff, it is so interesting to see how Buddhism, which originated in India, migrated, flourished & changed in the context of China. Here's some research (🙏🏼 @starkjeon for CN insight + citations)
1. LWJ’s sword Bichen ‘is likely an abbreviation for the term 躲避红尘 (duǒ bì hóng chén), which can be translated as such: 躲避: shunning or hiding away from; 红尘: worldly affairs’, which is a Buddhist teaching. (https://t.co/zF65W3roJe) (abbrev. TWX)
2. Sandu (三毒), Jiang Cheng’s sword, refers to the three poisons (triviṣa) in Buddhism: greed or desire (rāga), delusion (moha) and hatred (dveṣa).
These 3 poisons represent the roots of craving (tanha) and are the cause of Dukkha (suffering, pain) and thus result in rebirth.
Interesting that MXTX used this name for one of the characters who, arguably, suffers the worst from these three emotions.
3. The Qiankun purse: “乾坤袋 (qián kūn dài) – can be called ‘Heaven and Earth’ Pouch. In Buddhism, Maitreya (मैत्रेय) owns this to store items. It was believed that there was a mythical space inside the bag that could absorb the world.” (TWX)