Last June, OpenAI announced that users could request access to the GPT-3 API, a machine learning toolset, to help the company explore the strengths and limitations of the new technology.
GPT-3, from OpenAI (founded in 2015 with $1 billion from investors including Elon Musk), is the third generation of the company's large language model, with a capacity two orders of magnitude (100 times) greater than its predecessor, GPT-2. GPT-3 has 175 billion machine learning parameters, ten times more than the next-largest language model, Microsoft's Turing Natural Language Generation (NLG) model.
Some researchers have warned of the possible harmful effects of GPT-3. Gary Marcus, author, entrepreneur, and professor of psychology at New York University, published a piece with Ernest Davis in MIT Technology Review last August under the headline: "GPT-3, Bloviator: OpenAI's language generator has no idea what it's talking about." He specifically complained that OpenAI had not given his team research access to study the model.
Some are granted access. One of them is Sahar Mor, an AI and machine learning engineer and the founder of Stealth Co. in San Francisco. According to a recent report in Analytics India Magazine, Mor did not learn AI at a university, but as a member of the Israeli intelligence unit 8200.
"I was one of the first engineers in the AI community to have access to OpenAI's GPT-3 model," said Mor. He used the technology to build AirPaper, an automated document extraction API that launched last September.
The website attracts potential customers with "Reduce your operational workload" and "No more manual data entry. Extract what's important and remove your people in the loop."
The first 100 pages are free; after that, the service is subscription-based. "Send any document, either a PDF or an image, and get structured data," Mor explained.
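A call to a document-extraction API of this kind might look like the following sketch. The endpoint URL, field names, and response schema here are illustrative assumptions, not AirPaper's published interface.

```python
import base64
import json

# Placeholder endpoint; AirPaper's real URL and schema are not public here.
API_URL = "https://api.example.com/v1/extract"

def build_extract_request(doc_bytes: bytes, filename: str) -> dict:
    """Package a PDF or image as a JSON payload for the extraction API."""
    kind = "pdf" if filename.lower().endswith(".pdf") else "image"
    return {
        "filename": filename,
        "type": kind,
        # Binary documents are commonly base64-encoded for JSON transport.
        "content": base64.b64encode(doc_bytes).decode("ascii"),
    }

def parse_extract_response(body: str) -> dict:
    """Pull the structured fields out of a (hypothetical) JSON response."""
    data = json.loads(body)
    return data.get("fields", {})

# Example: an invoice goes in, structured key/value data comes out.
payload = build_extract_request(b"%PDF-1.4 ...", "invoice.pdf")
fields = parse_extract_response('{"fields": {"total": "42.00", "vendor": "Acme"}}')
```

The same request/response shape works for images; only the `type` field and the encoded bytes change.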
To gain access, Mor sent OpenAI's CTO an email with a brief background about himself and the app he envisioned. Part of the approval process is writing up what he learns about the model's flaws and possible ways to mitigate them. Once the application has been submitted, one has to wait. "The current waiting times can be forever," and developers who applied in late June were still waiting for a response in mid-March.
Development started with OpenAI's Playground tool, iterating to see whether GPT-3 can solve your problem. "This tinkering is key to developing the intuition needed to create successful prompts," Mor specified. He saw an opportunity for OpenAI to better automate this phase, which he proposed and which was implemented a few months later with their Instruct series of models.
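The Playground loop Mor describes can be mimicked in code: try several prompt phrasings against the same sample input and compare the outputs. The `query_model` function below is a stub standing in for a real GPT-3 call, with its behavior (terse answers for completion-style prompts, chatty answers otherwise) chosen only to make the comparison runnable offline.

```python
SAMPLE = "Receipt: Coffee 3.50, Bagel 2.00, Total 5.50"

# Three phrasings of the same task, from open question to completion style.
PROMPT_VARIANTS = [
    "What is the total? " + SAMPLE,
    "Document: " + SAMPLE + "\nTotal amount:",
    "Extract the total from this receipt.\n" + SAMPLE + "\nTotal:",
]

def query_model(prompt: str) -> str:
    """Stub: a real implementation would call the GPT-3 completions API.
    We fake a common observation: prompts that end mid-sentence (with a
    colon) tend to elicit terse completions, bare questions chatty ones."""
    if prompt.rstrip().endswith(":"):
        return "5.50"
    return "The total is 5.50."

# Iterate over variants and keep the ones producing the exact answer wanted.
results = {p: query_model(p) for p in PROMPT_VARIANTS}
good = [p for p, out in results.items() if out == "5.50"]
```

Running this kind of loop over real documents is how one builds the prompt intuition Mor refers to.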
Next, once he was happy with a prompt template, he built it into his code. He pre-processed each document, turning its OCR output into a "GPT-3 digestible prompt" that he used to query the API. After further testing and parameter tuning, he made the app available.
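A minimal sketch of that OCR-to-prompt step might look as follows. The template wording, the extracted fields, and the `key: value` output format are illustrative assumptions, not AirPaper's actual prompt.

```python
# Template that wraps raw OCR text into an instruction-style prompt.
PROMPT_TEMPLATE = (
    "Extract the invoice number, date, and total from the document below.\n"
    "Return one 'key: value' pair per line.\n\n"
    "Document:\n{ocr_text}\n\nExtraction:\n"
)

def build_prompt(ocr_text: str, max_chars: int = 6000) -> str:
    """Turn raw OCR output into a 'GPT-3 digestible' prompt, truncating
    long documents so the request stays within the model's context window."""
    return PROMPT_TEMPLATE.format(ocr_text=ocr_text[:max_chars])

def parse_completion(text: str) -> dict:
    """Parse the model's 'key: value' lines back into structured data."""
    fields = {}
    for line in text.strip().splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields

prompt = build_prompt("INVOICE #1234\nDate: 2021-03-01\nTotal: $99.50")
fields = parse_completion("invoice number: 1234\ndate: 2021-03-01\ntotal: $99.50")
```

In a live pipeline, `prompt` would be sent to the completions API and the returned text fed to `parse_completion`.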
When asked what challenges he faced in working with large language models, Mor cited "a lack of data relevant to the task at hand," which in his case is document processing. A number of commercial companies offer document intelligence APIs, but not as open-source software. Mor is now building one, called DocumNet, which he describes as "an ImageNet equivalent for documents."
Multimodal capabilities combining natural language and images are coming
In January, OpenAI released DALL-E, an AI program that creates images from text descriptions. It uses a 12-billion-parameter version of the GPT-3 transformer model to interpret natural language inputs and generate corresponding images. OpenAI also recently released CLIP, a neural network that learns visual concepts from natural language supervision.
When asked whether he sees these AI "fusion models," or multimodal systems that combine text and images, as the future of AI research, Mor said, "Definitely." He gave the example of a deep learning model for early cancer detection from images, which will perform poorly if not combined with text from patient charts in electronic health records.
"The main reason multimodal systems are not common in AI research is the lack of suitable data sets. This can be solved with more data, which is becoming increasingly available," explained Mor. Multimodal applications are not limited to vision plus language; they could also extend to vision plus language plus audio, he suggested.
When asked whether he believes GPT-3 should be regulated in the future, Mor said yes, but that it is difficult. OpenAI is self-regulating, showing that it recognizes the harmful potential of its technology. "And if so, can we trust a commercial company to regulate itself without a trained supervisory authority? What happens when such a company has to choose between ethics and revenue?" Mor wondered.
How an SEO professional in Australia got GPT-3 access
A search engine optimization expert in Australia recently got access to GPT-3 and wrote about the experience on the blog of his company, Digital Up.
Founder Ashar Jamil became interested in GPT-3 when he read an article in The Guardian that, the newspaper said, was written by a robot. "I was thrilled to use GPT-3 to help people in the SEO industry," said Jamil, whose company provides digital marketing and social media services.
He completed the OpenAI waitlist access form, detailing the purpose of his project, and waited. After a week of growing impatient, he decided to step up his efforts: he bought a "fancy domain" for the intended project, designed a demo landing page with a small animation, tweeted a video about the project, and tagged OpenAI's chairman.
"After only 10 minutes, I got an answer from him asking for my email. And boom, I had access," Jamil explained.
A slightly different approach to studying GPT-3 was recently taken by researchers at Stanford University's Human-Centered AI institute (HAI), which published a report on the effort. A group of scientists from computer science, linguistics, and philosophy convened for a workshop held under the Chatham House Rule, under which no participant may be identified by name; the theory is that this leads to a freer discussion.
The participants worked on two questions: What are the technical capabilities and limitations of large language models? And what are the social effects of their widespread use?
Among the discussion points:
Because GPT-3 has a wide variety of capabilities, "including text summarization, chatbots, search, and code generation," it is difficult to characterize all possible uses and abuses.
In addition, it is "unclear what impact highly capable models will have on the labor market. This raises the question of when (or which) jobs could (or should) be automated by large language models," says the HAI summary.
Another comment: "Some participants said that GPT-3 lacks intentions, goals, and the ability to understand cause and effect, all hallmarks of human cognition."
Likewise, "GPT-3 can exhibit undesirable behavior, including known racial, gender, and religious biases," the summary reads. Some discussion followed about how to respond. Finally: "The participants agreed that there is no silver bullet, and that further interdisciplinary research is needed on what values we should instill in these models and how to achieve this."
Everyone agreed that there is an urgent need to establish standards and guidelines for the use of large language models such as GPT-3.