AI https://old.t2.sa/en en GPT-3 and Conversational AI Models https://old.t2.sa/en/blog/GPT-3 <span>GPT-3 and Conversational AI Models</span> <span><span>zrik</span></span> <span>Mon, 01/25/2021 - 10:20</span> <div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"><p class="text-align-justify">Conversational AI is about making machines communicate as humanly as possible through an understanding of natural language. Such systems can be chatbots, voice bots, virtual assistants, and more. In practice they differ somewhat depending on the language and data they were trained on, but one key feature ties them all together: the ability to understand natural language and respond to it.</p> <p class="text-align-justify">GPT-3 (Generative Pre-trained Transformer 3) is a language model made by OpenAI, a research lab backed by major investors including Microsoft. It was released to the public through an API in July 2020. It is based on a well-known deep learning architecture called the Transformer, published in 2017, and it is called generative because, unlike neural networks that output a numeric score or a yes-or-no answer, it can generate long sequences of original text as its output.</p> <p class="text-align-justify">GPT-3 can do a lot: question answering, summarizing articles, information retrieval; it can even provide you with code snippets! What makes it unique is its size and the development it went through, which made it the closest model yet to human performance.</p> <p class="text-align-justify"> </p> <h3 class="text-align-justify"><strong>How to make your own GPT</strong></h3> <p class="text-align-justify"><br /> To create our own GPT language model, we need to figure out what is required and how to get it. As we know, the main building blocks of deep learning are the dataset and computing power.
So, let’s look at the numbers behind each.</p> <p class="text-align-justify">Starting with the dataset used to train GPT-3: OpenAI collected data from across the internet, yielding about 499 billion tokens (roughly, words), compared with GPT-2, which was trained on about 10 billion tokens (around 40GB of text). In total, GPT-3 was trained on about 2TB of data. Here is the breakdown of the data:</p> <table border="1" cellpadding="1" cellspacing="1" style="width: 500px;"><tbody><tr><td class="text-align-center" style="width: 250px;"><strong>Dataset</strong></td> <td class="text-align-center" style="width: 237px;"><strong># Tokens (Billions)</strong></td> </tr><tr><td class="text-align-center" style="width: 250px;">Common Crawl (filtered by quality)</td> <td class="text-align-center" style="width: 237px;">410</td> </tr><tr><td class="text-align-center" style="width: 250px;">WebText2</td> <td class="text-align-center" style="width: 237px;">19</td> </tr><tr><td class="text-align-center" style="width: 250px;">Books1</td> <td class="text-align-center" style="width: 237px;">12</td> </tr><tr><td class="text-align-center" style="width: 250px;">Books2</td> <td class="text-align-center" style="width: 237px;">55</td> </tr><tr><td class="text-align-center" style="width: 250px;">Wikipedia</td> <td class="text-align-center" style="width: 237px;">3</td> </tr><tr><td class="text-align-center" style="width: 250px;"><strong>Total</strong></td> <td class="text-align-center" style="width: 237px;"><strong>499</strong></td> </tr></tbody></table><p class="text-align-justify"><br /> Continuing with computing power: OpenAI found that to do well on their increasingly large datasets, they had to add more and more weights. Google’s BERT, built on the original Transformer architecture, had 110 million weights, and GPT-1 followed a similar design. With GPT-2, the count went up to 1.5 billion weights.
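</p>
<p class="text-align-justify">To get a feel for what these weight counts mean for hardware, here is a quick back-of-the-envelope estimate of the memory needed just to store a model’s parameters. This is only a sketch, assuming each parameter is stored as a standard 32-bit float (4 bytes):</p>

```python
# Rough memory footprint of a model's weights, assuming 4 bytes (fp32) each.
def fp32_gigabytes(n_params: int) -> float:
    """Memory in GB needed to hold n_params 32-bit floats."""
    return n_params * 4 / 1e9

# Parameter counts quoted in this post.
for name, n_params in [("BERT", 110_000_000),
                       ("GPT-2", 1_500_000_000),
                       ("GPT-3", 175_000_000_000)]:
    print(f"{name}: {fp32_gigabytes(n_params):,.1f} GB")
```

<p class="text-align-justify">For a 175-billion-parameter model this works out to roughly 700GB of weights alone, which is why memory capacity matters as much as raw compute.</p>
<p class="text-align-justify">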
With the latest GPT-3, the number of parameters reached 175 billion, making GPT-3 the biggest neural network in the world at the time.<br /> Training a model with 175 billion weights is not a simple matter of turning up a parameter count; it becomes an incredible exercise in parallel computing. You can see how it compares to other models in figure 1. </p> <p class="text-align-justify"><img alt="GPT3-1" class="img-responsive" data-entity-type="file" data-entity-uuid="738190df-50d1-4c30-a9d9-b21ced29f1e4" src="/sites/default/files/inline-images/image1.png" width="1200" height="673" loading="lazy" /></p> <p class="text-align-center"><br /> Figure 1. GPT-3 training compared to other models <br /> Source from <a href="https://www.zdnet.com/article/what-is-gpt-3-everything-business-needs-to-know-about-openais-breakthrough-ai-language-program/">here</a></p> <p class="text-align-justify">OpenAI hasn’t described the exact computer configuration used for training, but others have reported that it ran on a cluster of Nvidia V100 GPUs in Microsoft Azure. The company did describe the total compute required: the equivalent of one thousand trillion floating-point operations per second (one petaflop/s), sustained for 3,640 days. To put that in perspective, a single training run would take on the order of 355 years on a single V100 (the most powerful GPU available at the time), with an estimated cost of $4.6 million.</p> <p class="text-align-justify">Compute cycles are not the only problem; there is also the capacity needed just to hold 175 billion weights. Each parameter is a 32-bit float, so the model requires about 700GB of GPU RAM in total, roughly ten times the memory of a single GPU.</p> <p class="text-align-justify">If you’re still up for the challenge and want to experiment with a GPT model, you can try GPT-Neo from <a href="https://github.com/EleutherAI/gpt-neo">here</a>.
It’s an implementation of model- and data-parallel GPT-2- and GPT-3-style models, with the ability to scale up to full GPT-3 sizes (and possibly more!), using the Mesh TensorFlow library.</p> <p class="text-align-justify">But is it really worth it? We discuss its limitations next.</p> <h3 class="text-align-justify"><br /><strong>Limitations of GPT-3</strong></h3> <p class="text-align-justify">So what can GPT-3 do? Well, for one thing, it can answer questions on any topic while retaining the context of previously asked questions. Figure 2 shows examples of questions GPT-3 got right.</p> <p class="text-align-justify"> </p> <p class="text-align-justify"><img alt="GPT3-2" class="img-responsive" data-entity-type="file" data-entity-uuid="d4e73a06-67b3-4d2a-99e7-26224edc8b08" src="/sites/default/files/inline-images/image2.PNG" width="1133" height="487" loading="lazy" /></p> <p class="text-align-center"> Figure 2. GPT-3 answering correctly <br /> Source from <a href="https://lacker.io/ai/2020/07/06/giving-gpt-3-a-turing-test.html">here</a></p> <p class="text-align-justify"> </p> <h4 class="text-align-justify"><strong>So, when does the GPT-3 model fail?</strong></h4> <p class="text-align-justify">GPT-3 knows how to have a normal conversation, but it doesn’t quite know how to say “Wait a moment, your question is nonsense.” It also doesn’t know how to say “I don’t know,” as figure 3 shows.</p> <p class="text-align-justify"><img alt="GPT3-3" class="img-responsive" data-entity-type="file" data-entity-uuid="a64c571d-d12d-46ae-849f-a5329e3106c7" src="/sites/default/files/inline-images/image3_0.png" width="880" height="374" loading="lazy" /></p> <p class="text-align-justify"> </p> <p class="text-align-center">Figure 3.
GPT-3 answering nonsense questions<br /> Source from <a href="https://lacker.io/ai/2020/07/06/giving-gpt-3-a-turing-test.html">here</a></p> <p class="text-align-justify"> </p> <p class="text-align-justify">People are used to computers being superhumanly good at logical activities, like playing chess or adding numbers. It may come as a surprise, then, that GPT-3 is not perfect at simple math, as the questions in figure 4 show. </p> <p class="text-align-justify"><img alt="GPT3-4" class="img-responsive" data-entity-type="file" data-entity-uuid="f74d469e-390c-40c2-8e87-e7bf7c048e3b" src="/sites/default/files/inline-images/image4.PNG" width="1117" height="483" loading="lazy" /></p> <p class="text-align-center">Figure 4. GPT-3 confused by simple math<br /> Source from <a href="https://lacker.io/ai/2020/07/06/giving-gpt-3-a-turing-test.html">here</a></p> <p class="text-align-justify">This problem shows up in more everyday questions as well, for example when you ask it about the result of a sequence of operations, as in figure 5. </p> <p class="text-align-justify"><img alt="GPT3-5" class="img-responsive" data-entity-type="file" data-entity-uuid="f39842c9-7460-4ef2-9b7e-dc3cb37fb89e" src="/sites/default/files/inline-images/image5.PNG" width="1125" height="410" loading="lazy" /></p> <p class="text-align-center">Figure 5. GPT-3 fails on a sequence of operations.<br /> Source from <a href="https://lacker.io/ai/2020/07/06/giving-gpt-3-a-turing-test.html">here</a></p> <p class="text-align-justify"><br /> It’s as if GPT-3 has limited short-term memory and has trouble reasoning about more than one or two objects in a sentence.</p> <p class="text-align-justify">The biggest problem with GPT-3 is that, like most neural networks, it is a black box. It is captivating because it answers such a vast array of questions correctly, but it also gets quite a few wrong, as we saw above. And when GPT-3 falls short, there is no way to debug it or pinpoint the source of the error.
Any customer-facing interface that cannot be iterated on and revised is neither sustainable nor scalable in a business environment. </p> <p class="text-align-justify">This is another aspect in which current conversational AI solutions are superior to GPT-3. Even simple chatbots allow their users to alter and improve their conversational flows as needed. With more sophisticated conversational AI interfaces, users not only get a clear snapshot of the error but can track down, diagnose, and remedy the issue instantly.</p> <h3 class="text-align-justify"><br /><strong>Conclusion</strong></h3> <p class="text-align-justify"><br /> GPT-3 is quite impressive in some areas and still clearly subhuman in others, but it is far from ready to be used in real products. Businesses looking to provide their audiences with engaging, timely, and helpful conversational experiences will continue to rely on existing conversational AI solutions.</p> <p class="text-align-justify">I still hope that, with a better understanding of its strengths and weaknesses, we data scientists will be better equipped to use modern language models like GPT-3 in a business environment with real products.</p> <h4 class="text-align-justify"><br /><strong>Resources:</strong></h4> <p class="text-align-justify"> </p> <ul><li class="text-align-justify"><a href="https://en.wikipedia.org/wiki/GPT-3">https://en.wikipedia.org/wiki/GPT-3</a></li> <li class="text-align-justify"><a href="https://lambdalabs.com/blog/demystifying-gpt-3/">https://lambdalabs.com/blog/demystifying-gpt-3/</a></li> <li class="text-align-justify"><a href="https://www.zdnet.com/article/what-is-gpt-3-everything-business-needs-to-know-about-openais-breakthrough-ai-language-program/">https://www.zdnet.com/article/what-is-gpt-3-everything-business-needs-to-know-about-openais-breakthrough-ai-language-program/</a></li> <li class="text-align-justify"><a href="https://arxiv.org/abs/2005.14165">https://arxiv.org/abs/2005.14165</a></li> <li
class="text-align-justify"><a href="https://lacker.io/ai/2020/07/06/giving-gpt-3-a-turing-test.html">https://lacker.io/ai/2020/07/06/giving-gpt-3-a-turing-test.html</a><br />  </li> </ul></div> <div class="field field--name-field-media-single field--type-entity-reference field--label-above"> <div class="field--label">Banner image</div> <div class="field--item"><a href="/en/media/396" hreflang="en">GPT-3.png</a></div> </div> <section> </section> <div class="field field--name-field-tags field--type-entity-reference field--label-above"> <div class="field--label">Tags</div> <div class="field--items"> <div class="field--item"><a href="/en/taxonomy/term/53" hreflang="en">GPT-3</a></div> <div class="field--item"><a href="/en/taxonomy/term/54" hreflang="en">GPT</a></div> <div class="field--item"><a href="/en/taxonomy/term/38" hreflang="en">AI</a></div> <div class="field--item"><a href="/en/taxonomy/term/55" hreflang="en">AI Models</a></div> <div class="field--item"><a href="/en/taxonomy/term/56" hreflang="en">Conversational</a></div> </div> </div> <div class="field field--name-field-author field--type-entity-reference field--label-above"> <div class="field--label">Author</div> <div class="field--item"><a href="/en/node/118" hreflang="en">Mahmoud Ayman Sayed</a></div> </div> Mon, 25 Jan 2021 07:20:49 +0000 zrik 119 at https://old.t2.sa Introduction to Artificial Intelligence https://old.t2.sa/en/blog/Introduction-to-AI <span>Introduction to Artificial Intelligence</span> <span><span>zrik</span></span> <span>Wed, 01/06/2021 - 17:56</span> <div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"><p class="text-align-justify">You have surely heard the term “Artificial Intelligence” (AI) here and there, over and over. People say it solves many problems and might even take over human roles in the future.<br /> Why is everyone talking about it day and night? Why do we need it? What is it?
How did it start?</p> <h3 class="text-align-justify"><strong>Why do we need AI?</strong></h3> <p class="text-align-justify">Our world has many problems that we wish, from the bottom of our hearts, could be solved. Let’s concentrate on a few of them. Investigations of medical errors in the USA show that 5% of adults are misdiagnosed yearly, and <a href="https://www.washingtonpost.com/national/health-science/20-percent-of-patients-with-serious-conditions-are-first-misdiagnosed-study-says/2017/04/03/e386982a-189f-11e7-9887-1a5314b56a08_story.html">this misdiagnosis is the cause of 10% of patient deaths</a>. Diagnosis is hard, experienced doctors are not always available or affordable, and we must not forget human error. Moreover, doctors work under high daily pressure, which makes errors more likely.</p> <p class="text-align-justify">Turning to another worldwide disaster, building collapses: the chart below shows the <a href="https://scroll.in/article/668636/across-india-2600-people-die-every-year-in-building-and-other-structural-collapses.">number of deaths caused by structural collapses in India over 2003-2012</a>, a toll exceeding 2,000 people a year.
Based on the five reasons mentioned in <a href="https://www.bbc.com/news/world-africa-36205324">this article</a>, we can say that lack of experience, human error, and fraud are the main causes of building collapses around the world.</p> <p class="text-align-justify"> </p> <p class="text-align-justify"><img alt="AI-Figure-Number-of-Death" class="img-responsive" data-entity-type="file" data-entity-uuid="2483378c-7bc7-4439-8a4c-3c9414a3f9e8" src="/sites/default/files/inline-images/Figure%201%20Number%20of%20deaths%20cause%20by%20structural%20collapses%20in%20India%20during%20the%20interval%202003-2012.png" width="1200" height="676" loading="lazy" /><br />  </p> <p class="text-align-justify">We all know about the high rate of deaths resulting from car accidents every year, and <a href="http://www.kostelecplanning.com/on-the-9th-day-of-safety-myths-my-dot-gave-to-me-94-percent/">94% of car accidents are caused by human choices or errors</a>. </p> <p class="text-align-justify">With AI, the dream comes true. AI enables personalized healthcare and <a href="https://www.jax.org/personalized-medicine/precision-medicine-and-you/what-is-precision-medicine#:~:text=Personalized%20medicine%2C%20because%20it%20is,predict%20susceptibility%20to%20disease">person-centered medicine</a>. AI will not only be capable of <a href="https://www.sciencedirect.com/science/article/pii/S2212420920314126#sec3">solving building collapses</a> but will also make it possible for each of us to <a href="https://azati.ai/artificial-intelligence-in-building-and-construction/">afford one</a> of <a href="https://en.wikipedia.org/wiki/Zaha_Hadid">Zaha Hadid</a>’s designs. <a href="https://www.synopsys.com/automotive/what-is-autonomous-car.html">Self-driving car</a> usage in the USA has reportedly <a href="https://www.tesla.com/en_JO/VehicleSafetyReport">decreased accident counts by 90%</a>.
</p> <p class="text-align-justify">AI solves these problems and comes with extra features we used to only dream of. You will no longer be forced to waste your time shopping; AI will pick exactly the suit you want for that specific ceremony. AI will also restock the missing food in your refrigerator based on your needs. AI will go beyond that by reserving your wife’s favorite resort for your anniversary. You no longer need to worry about forgetting that date; AI will take care of you.  <br /> To sum up what AI can do for us:<br /> •    Make the most experienced human skills affordable to all human beings<br /> •    Save our time by performing tedious work tailored to our needs and wishes</p> <p class="text-align-justify">So, what is AI? And how can it solve our problems in such a smart way?</p> <h3 class="text-align-justify"><br /><strong>What is AI?</strong></h3> <p class="text-align-justify">We, humans, like what behaves like us. Therefore, we worked on making computers think and learn like we do. AI, in short, is the field that enables computers to solve human problems using the smartest human methods. Previously, solving problems with computers meant programming each case with its required action. However, that is an impossible task when cases have many, and sometimes hidden, variations. Therefore, guiding computers moved from feeding them instructions to feeding them examples. The computer’s job is then to find the reasons behind each scenario based on statistics and probabilities, provide us with its conclusions, and take the suitable action for that scenario. <br /> When did AI research start? By whom? And how has it progressed until now?</p> <h3 class="text-align-justify"> </h3> <h3 class="text-align-justify"><strong>AI History</strong></h3> <p class="text-align-justify">AI research started many decades ago.
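</p>
<p class="text-align-justify">Before the history, the shift from feeding computers instructions to feeding them examples, described in the previous section, can be made concrete with a toy sketch. This is only an illustration (the names and the word-counting heuristic are made up for this post, not a real AI system):</p>

```python
from collections import Counter

# Instruction-based: the programmer hand-codes every rule.
def is_spam_by_rules(message: str) -> bool:
    return "free" in message.lower() or "winner" in message.lower()

# Example-based: the program derives its own rule from labeled examples,
# using simple word statistics, in the spirit described above.
def train(examples):
    """Count word occurrences separately for spam and normal messages."""
    spam_words, ham_words = Counter(), Counter()
    for text, label in examples:
        (spam_words if label == "spam" else ham_words).update(text.lower().split())
    return spam_words, ham_words

def is_spam_learned(message, spam_words, ham_words):
    """Classify by which category's words the message resembles more."""
    words = message.lower().split()
    return sum(spam_words[w] for w in words) > sum(ham_words[w] for w in words)

examples = [("claim your free prize", "spam"),
            ("free money winner", "spam"),
            ("meeting at noon", "ham"),
            ("see you at lunch", "ham")]
spam_words, ham_words = train(examples)
print(is_spam_learned("free prize inside", spam_words, ham_words))  # True
```

<p class="text-align-justify">The point is that nobody told the second classifier which words matter; it inferred that from the examples, which is the essence of learning from data.</p>
<p class="text-align-justify">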
<a href="https://en.wikipedia.org/wiki/Alan_Turing">Alan Turing</a>, in a public lecture in 1947, described AI essentially the way it works today: “What we want is a machine that can learn from experience,” adding that the “possibility of letting the machine alter its own instructions provides the mechanism for this.” <br />  </p> <p class="text-align-justify"><img alt="Alan-Turing" class="img-responsive" data-entity-type="file" data-entity-uuid="42237acb-8ed0-412c-8578-cf9fb651dc22" src="/sites/default/files/inline-images/alan_tuing.png" width="1242" height="330" loading="lazy" /></p> <p class="text-align-justify"> </p> <p class="text-align-justify">Turing also proposed a test to identify whether a machine has reached the level of intelligence he envisioned. This test is called the <a href="https://plato.stanford.edu/entries/turing-test/">Turing Test</a>. Some machines are claimed to have passed it already, and you have likely used one at least once: the <a href="https://www.getjenny.com/what-is-a-chatbot">chatbot</a>. A chatbot is a good example of an AI machine that replies to human questions in such a way that we cannot tell whether the reply came from a human or a machine, which is exactly what Turing proposed.</p> <p class="text-align-justify">So yes, AI research started many decades ago. Why, then, have we only recently started hearing about it so often? AI grew very slowly during the past century and stalled many times (the so-called AI winters). The two main reasons for the slow progress were the lack of high-performance machines and the lack of data. In the last two decades, with the rise of interactive sites such as social media, there is no lack of data anymore, and high-performance machines have become not only available but affordable to many scientists and startups across the world. The following infographic shows the ebbs and flows of AI history.
More information about AI history can be found in “<a href="http://sitn.hms.harvard.edu/flash/2017/history-artificial-intelligence/">The history of Artificial Intelligence</a>”. <br />  </p> <p class="text-align-justify"><img alt="AI-Figure-History" class="img-responsive" data-entity-type="file" data-entity-uuid="e3b80043-c497-4e93-93e1-0a4e7e2efdd8" src="/sites/default/files/inline-images/Figure%202%20AI%20History%20Infographic.jpeg" width="1920" height="1080" loading="lazy" /></p> <p class="text-align-justify"> </p> <h3 class="text-align-justify"><strong>AI Types</strong></h3> <p class="text-align-justify">AI is categorized into two types, based on whether a machine performs well across various intelligent tasks or in one specific task only; both types are expected to exceed human capability in at least one specific task.</p> <p class="text-align-justify"><br /> 1.    <a href="https://www.investopedia.com/terms/w/weak-ai.asp">Weak AI or Narrow AI (ANI)</a>:<br /> This is AI that solves one specific task. It is needed for time-consuming tasks and for tasks we are incapable of solving ourselves because of the complex relations between cause and effect. We have many weak AIs these days, from <a href="https://towardsdatascience.com/a-brief-introduction-to-intent-classification-96fda6b1f557">predicting your next word</a> as you type on your smartphone to <a href="https://www.synopsys.com/automotive/what-is-autonomous-car.html">autonomous vehicles (self-driving cars)</a>.</p> <p class="text-align-justify">2.    <a href="https://www.ibm.com/cloud/learn/strong-ai">Strong AI or Artificial General Intelligence (AGI)</a>:<br /> This type is still theoretical, as it is the form of human-like intelligent machines. AGI mimics humans in terms of self-consciousness, reasoning, problem solving, acting, and planning.
If you want to imagine how AGI might behave in the future, the movie <a href="https://en.wikipedia.org/wiki/I,_Robot_(film)">I, Robot</a> might help.</p> <p class="text-align-justify">We still have not covered how AI learns from examples; that is our next topic: Machine Learning. Stay tuned!</p> <p class="text-align-justify"> </p> <p class="text-align-justify"><img alt="AI-Figure-vs-ML" class="img-responsive" data-entity-type="file" data-entity-uuid="b6ff675a-c8b2-41b2-90d8-8cad1dac4af5" src="/sites/default/files/inline-images/Figure3_AI_DL_ML.png" width="525" height="542" loading="lazy" /></p> <p class="text-align-justify"> </p> <h3 class="text-align-justify"><strong>Read More</strong></h3> <p class="text-align-justify">1.    Why is artificial intelligence important? <a href="https://www.sas.com/en_us/insights/analytics/what-is-artificial-intelligence.html">https://www.sas.com/en_us/insights/analytics/what-is-artificial-intelligence.html</a><br /> 2.    Intro to AI: <a href="https://www2.slideshare.net/ankit_ppt/lesson-1-intro-to-ai-132216012?from_action=save">https://www2.slideshare.net/ankit_ppt/lesson-1-intro-to-ai-132216012?from_action=save</a><br /> 3.    AI History: <a href="http://sitn.hms.harvard.edu/flash/2017/history-artificial-intelligence/">http://sitn.hms.harvard.edu/flash/2017/history-artificial-intelligence/</a><br /> 4.    AI Types: <a href="https://www.ibm.com/cloud/learn/strong-ai#toc-strong-ai--YaLcx8oG">https://www.ibm.com/cloud/learn/strong-ai#toc-strong-ai--YaLcx8oG</a><br /> 5.
Machine Learning<br />  </p> </div> <div class="field field--name-field-media-single field--type-entity-reference field--label-above"> <div class="field--label">Banner image</div> <div class="field--item"><a href="/en/media/380" hreflang="en">AI_323829966.jpeg</a></div> </div> <section> </section> <div class="field field--name-field-tags field--type-entity-reference field--label-above"> <div class="field--label">Tags</div> <div class="field--items"> <div class="field--item"><a href="/en/taxonomy/term/38" hreflang="en">AI</a></div> <div class="field--item"><a href="/en/taxonomy/term/39" hreflang="en">Artificial Intelligence</a></div> <div class="field--item"><a href="/en/taxonomy/term/40" hreflang="en">Artificial</a></div> <div class="field--item"><a href="/en/taxonomy/term/41" hreflang="en">Intelligence</a></div> <div class="field--item"><a href="/en/taxonomy/term/42" hreflang="en">Machine Learning</a></div> <div class="field--item"><a href="/en/taxonomy/term/43" hreflang="en">Machine</a></div> <div class="field--item"><a href="/en/taxonomy/term/44" hreflang="en">Learning</a></div> </div> </div> <div class="field field--name-field-author field--type-entity-reference field--label-above"> <div class="field--label">Author</div> <div class="field--item"><a href="/en/node/112" hreflang="en">Esra&#039;a Bani-Issa</a></div> </div> Wed, 06 Jan 2021 14:56:23 +0000 zrik 113 at https://old.t2.sa