1 There's a chasm between an NLP technology that works well in the research lab and something that works for applications that real people use. This was eye-opening when I started my career, and every time I talk to an NLP engineer at @textio, it continues to strike me even now.

2 Research conditions are theoretical and/or idealized. A huge problem for so-called NLP or AI startups with highly credentialed academic founders is that they bring limited knowledge of what it takes to build real products outside the lab.
3 A product is ultimately a thing that people pay for - not just cool technology or user experience. But I’m not even talking about knowledge gaps in go-to-market work. I'm talking purely technical gaps: how you go from science project to performant + delightful user experience.
4 Most commoditized NLP packages solve well-understood problems in standard ways that sacrifice either precision or performance. In a research lab, this is not usually a hard trade-off; in general, no one is using what you make, so performance is less important than precision.
5 In software, when you’re making something for real people to use, these trade-offs are a big deal. Especially if you’re asking those people to pay for what you’ve made (can’t get away from that pesky GTM thinking). They expect quality, which includes precision AND performance.
6 Example: Let’s say you’re trying to do something simple and commoditized, like implement a grammar checker. (I’ll pause while someone argues with me, but I stand by it: grammar checking is a commodity offering, not a commercial one.)
7 Grammar checkers have historically been rule-based, which means that someone can sit down + write a dozen/hundred/thousand rule-based statements that capture the system you want to implement. But not all rules are created equal!
8 You can choose a small number of rules that account for the majority of grammar mistakes that people make. By keeping the rule set small, you can make sure the system works faster - it won’t take huge swaths of time to calculate errors and suggestions across an entire document.
9 But by choosing a small set of grammar rules, you end up with a long tail of mistakes that profoundly erodes user confidence in your system overall. You may catch 80% of the errors with 5% of the rules, but the 20% you mischaracterize makes the user think your system is trash!
10 By contrast, implementing thousands of rules gets you awesome precision. But how long do all these rules make it take to grammar-check someone's real documents? You may get all the grammar right, but your app's performance erodes user confidence anyway.
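To make the rule-count trade-off concrete, here is a minimal sketch, assuming a regex-based checker. The rule names, the patterns, and the synthetic `rareword` long tail are all hypothetical stand-ins, not rules from any real grammar checker. The only point is that every rule you add is another pass over the document.

```python
import re
import time

# Hypothetical "core" rules: (name, compiled pattern, suggestion).
# Illustrative only; a real checker needs far more careful patterns.
CORE_RULES = [
    ("a_before_vowel", re.compile(r"\ba (?=[aeiou])", re.I), "use 'an'"),
    ("doubled_word", re.compile(r"\b(\w+) \1\b", re.I), "remove repeated word"),
    ("could_of", re.compile(r"\b(could|should|would) of\b", re.I), "use 'have'"),
]


def make_long_tail(n):
    """Build n synthetic low-frequency rules standing in for a long tail."""
    return [
        (f"tail_{i}", re.compile(rf"\brareword{i}\b"), "rewrite")
        for i in range(n)
    ]


def check(text, rules):
    """Scan the text once per rule; return (name, offset, suggestion) tuples."""
    findings = []
    for name, pattern, suggestion in rules:
        for match in pattern.finditer(text):
            findings.append((name, match.start(), suggestion))
    return sorted(findings, key=lambda finding: finding[1])


def timed_check(text, rules):
    """Run check() and also return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    findings = check(text, rules)
    return findings, time.perf_counter() - start


# Usage: same document, small rule set vs. small set plus a 2000-rule tail.
doc = "I could of sworn it was a apple apple. It has rareword7 in it. " * 200
small, t_small = timed_check(doc, CORE_RULES)
large, t_large = timed_check(doc, CORE_RULES + make_long_tail(2000))
print(f"{len(CORE_RULES)} rules: {len(small)} findings in {t_small:.4f}s")
print(f"{len(CORE_RULES) + 2000} rules: {len(large)} findings in {t_large:.4f}s")
```

The larger rule set catches the long-tail error the core rules miss, but it scans the document once per rule, so the time per document grows with the number of rules. Real checkers have ways to soften this, but the tension between coverage and speed stays.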
11 All this is for a "simple," commoditized feature… not so simple, even with rules, and even for something commoditized that everyone expects to "just work." Now let’s say you’re NOT implementing a grammar checker as a be-all, end-all, but as a component of a larger system.
12 The complexity that exists in your grammar checker exists across your system, and further, all the libraries you use (build, buy, or borrow) have to interact with each other… further slowing your system down and/or compromising the precision of one part in service of another.
13 You only encounter these issues as a production NLP engineer. They don’t come up in the research lab. Which is why it takes so long for great research to impact real products (which again, are things people pay for). And why so many researchers do not enjoy industry work.
14 Thanks to @kwhumphreys, who inspired my thinking down this path today and who solves these problems for us every day! 🎉

More from Machine learning

10 PYTHON 🐍 libraries for machine learning.

Retweets are appreciated.
[ Thread ]


1. NumPy (Numerical Python)

- The most powerful feature of NumPy is the n-dimensional array.

- It contains basic linear algebra functions, Fourier transforms, and tools for integrating with low-level languages such as C/C++ and Fortran.

Ref:
https://t.co/XY13ILXwSN
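A small sketch of the features named above: an n-dimensional array, a linear algebra call, and a Fourier transform. The array values and the 3-cycle test signal are made up for illustration.

```python
import numpy as np

# A 2-D array (2 rows, 3 columns) built from a flat range.
a = np.arange(6, dtype=float).reshape(2, 3)

# Linear algebra: matrix product of a with its transpose.
gram = a @ a.T

# Linear algebra: solve the system 2x = 2, 4y = 8.
solution = np.linalg.solve(
    np.array([[2.0, 0.0], [0.0, 4.0]]),
    np.array([2.0, 8.0]),
)

# Fourier transform: a sine with 3 cycles over 64 samples.
signal = np.sin(2 * np.pi * 3 * np.arange(64) / 64)
spectrum = np.abs(np.fft.rfft(signal))
peak_bin = int(np.argmax(spectrum))  # index of the dominant frequency bin

print(a.shape, gram, solution, peak_bin)
```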


2. SciPy (Scientific Python)

- SciPy is built on NumPy.

- It provides a variety of high-level science and engineering modules, covering discrete Fourier transforms, linear algebra, optimization, and sparse matrices.

Ref: https://t.co/ALTFqM2VUo
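A brief sketch touching two of the modules listed above, optimization and sparse matrices. The objective function and the matrix values are invented for the example.

```python
import numpy as np
from scipy import optimize, sparse

# Optimization: minimize a one-variable quadratic with its minimum at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

# Sparse matrices: store only the non-zero entries of a mostly-zero array.
dense = np.array([[0, 0, 3], [4, 0, 0]])
compressed = sparse.csr_matrix(dense)

print(result.x, compressed.nnz)
```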


3. Matplotlib

- Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.

- You can also use LaTeX commands to add math to your plot.

- Matplotlib makes hard things possible.

Ref: https://t.co/zodOo2WzGx
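A minimal sketch of a static plot with LaTeX-style math in the labels. The non-interactive `Agg` backend and the in-memory PNG buffer are choices made here so the snippet runs without a display. Matplotlib's built-in math renderer handles the `$...$` text, so this should not need a LaTeX install.

```python
import io

import matplotlib

matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x))
ax.set_title(r"$y = \sin(x)$")    # math text in the title
ax.set_xlabel(r"$x$ (radians)")   # math mixed with plain text

# Write the figure to an in-memory PNG instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```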


4. Pandas

- Pandas is for structured data operations and manipulations.

- It is extensively used for data munging and preparation.

- Pandas is a relatively recent addition to the Python ecosystem and has been instrumental in boosting Python’s usage.

Ref: https://t.co/IFzikVHht4
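A short sketch of the kind of data munging described above. The column names and values are made up for illustration.

```python
import pandas as pd

# Messy input: stray whitespace, a missing name, and a non-numeric score.
raw = pd.DataFrame(
    {
        "name": [" Ada ", "Grace", None, "Linus"],
        "score": ["90", "85", "70", "n/a"],
    }
)

clean = raw.dropna(subset=["name"]).assign(
    # Trim whitespace around names.
    name=lambda df: df["name"].str.strip(),
    # Convert scores to numbers; unparseable values become NaN.
    score=lambda df: pd.to_numeric(df["score"], errors="coerce"),
)

mean_score = clean["score"].mean()  # NaN values are skipped by default
print(clean)
print(mean_score)
```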
