KEAR (Knowledgeable External Attention for commonsense Reasoning), along with recent milestones in computer vision and neural text-to-speech, is part of a larger Azure AI mission to provide relevant, meaningful AI solutions and services that work better for people because they better capture how people learn and work, with improved vision, knowledge understanding, and speech capabilities. At the center of these efforts is XYZ-code, a joint representation of three cognitive attributes: monolingual text (X), audio or visual sensory signals (Y), and multilingual (Z). For more information about these efforts, read the XYZ-code blog post.

Last month, our Azure Cognitive Services team, comprising researchers and engineers with expertise in AI, achieved a groundbreaking milestone by advancing commonsense language understanding. When given a question that requires drawing on prior knowledge and five answer choices, our latest model, KEAR, performs better than people answering the same question, calculated as the majority vote among five individuals. KEAR reaches an accuracy of 89.4 percent on the CommonsenseQA leaderboard compared with 88.9 percent human accuracy. While the CommonsenseQA benchmark is in English, we follow a similar technique for multilingual commonsense reasoning and topped the X-CSR leaderboard.

Although recent large deep learning models trained with big data have made significant breakthroughs in natural language understanding, they still struggle with commonsense knowledge about the world, information that we, as people, have gathered in our day-to-day lives over time. Commonsense knowledge is often absent from task input but is crucial for language understanding. For example, take the question "What is a treat that your dog will enjoy?" To select an answer from the choices salad, petted, affection, bone, and lots of attention, we need to know that dogs generally enjoy food such as bones for a treat. Thus, the best answer would be "bone." Without this external knowledge, even large-scale models may generate incorrect answers. For example, the DeBERTa language model selects "lots of attention," which is not as good an answer as "bone."

On the other hand, expert systems built from many rules and much domain knowledge but little data have failed to deliver on their promise of AI that understands and reasons more like people do. We revisit the rules-and-knowledge approach and find that deep learning models and knowledge can be organically combined via an external attention mechanism to achieve breakthroughs in AI. With KEAR, we specifically equip language models with commonsense knowledge from a knowledge graph, a dictionary, and publicly available machine learning data.

For the CommonsenseQA task, given a question and five candidate answers, the KEAR model first retrieves related knowledge from a knowledge graph via entity linking, from a dictionary via word matching, and from related QA datasets via text retrieval. Then, the retrieved knowledge is concatenated with the input question and candidate answer and fed into a language model to produce a score. The candidate answer with the highest score is chosen as the output. The final submission is generated by an ensemble of 39 language models, such as DeBERTa and ELECTRA, with majority voting.

For example, for the aforementioned question, "What is a treat that your dog will enjoy?", KEAR retrieves "Dog - desires - petted, affection, bone, lots of attention" from the knowledge graph ConceptNet (note that the choice "salad," offered as one of the five options, doesn't appear in the retrieved results); "Bone: a composite material making up the skeleton of most vertebrates" from the dictionary Wiktionary; and "What do dogs like to eat? bones" from the training data in the CommonsenseQA dataset. After concatenating the retrieved knowledge with the input, KEAR feeds it into the DeBERTa model, which selects the answer "bone." In this way, the KEAR model can attend to related external knowledge for effective commonsense understanding.

In applying external attention to multilingual commonsense reasoning, we translate a non-English question into English, retrieve the knowledge from various sources, and translate the knowledge text into the source language for external attention. The proposed model, Translate-Retrieve-Translate (TRT), achieved first place on both the X-CODAH and X-CSQA datasets on the X-CSR benchmark.

External attention: The benefits of looking outward

External attention is complementary to self-attention, which has been widely adopted by many of today's AI systems, such as those using Transformers. These systems rely on a large amount of diverse data to achieve impressive AI performance with very large models.
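The pipeline described above (retrieve knowledge for each question-candidate pair, concatenate it with the input, score with a language model, and take the highest-scoring candidate) can be sketched in a few lines. This is a minimal illustration only, not KEAR's actual implementation: `retrieve_knowledge` and `score` are hypothetical stand-ins for the entity-linking, word-matching, and text-retrieval components and for the DeBERTa scorer.

```python
# Sketch of KEAR-style external attention for multiple-choice QA.
# The retrieval and scoring functions below are illustrative stubs.

def retrieve_knowledge(question: str, candidate: str) -> list[str]:
    """Stand-in for KEAR's three retrievers: a knowledge graph (entity
    linking), a dictionary (word matching), and related QA training
    data (text retrieval)."""
    toy_knowledge = {
        "bone": [
            "Dog - desires - petted, affection, bone, lots of attention",
            "Bone: a composite material making up the skeleton of most vertebrates",
            "What do dogs like to eat? bones",
        ],
    }
    return toy_knowledge.get(candidate, [])

def score(text: str) -> float:
    """Stand-in for a language model (e.g., DeBERTa) scoring the
    concatenated input; here it simply counts retrieved evidence."""
    return float(text.count("[KNOWLEDGE]"))

def answer(question: str, candidates: list[str]) -> str:
    """External attention: concatenate retrieved knowledge with the
    input so the model can attend to it, then take the argmax."""
    def scored(candidate: str) -> float:
        facts = retrieve_knowledge(question, candidate)
        joined = " ".join(f"[KNOWLEDGE] {f}" for f in facts)
        return score(f"{question} [SEP] {candidate} [SEP] {joined}")
    return max(candidates, key=scored)

question = "What is a treat that your dog will enjoy?"
choices = ["salad", "petted", "affection", "bone", "lots of attention"]
print(answer(question, choices))  # -> bone
```

In the real system, the concatenated sequence is fed to the language model's input layer, so self-attention over the combined text is what lets the model "attend" to the external knowledge.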
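The final ensembling step, majority voting over the predictions of many models, is simple to express; here is a minimal sketch (the 39-model ensemble itself is of course not reproduced, and the vote list is hypothetical):

```python
from collections import Counter

def majority_vote(predictions: list[str]) -> str:
    """Return the answer chosen by the most models.
    Counter.most_common breaks ties by first-seen order."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical predictions from an ensemble of models:
votes = ["bone", "bone", "lots of attention", "bone", "petted"]
print(majority_vote(votes))  # -> bone
```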
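The Translate-Retrieve-Translate flow for multilingual questions follows the three steps named in the text. The sketch below only illustrates the control flow; the `translate` and `retrieve` functions are toy placeholders, not the system's actual translation or retrieval components.

```python
# Sketch of the Translate-Retrieve-Translate (TRT) flow for
# multilingual commonsense reasoning, with placeholder components.

def translate(text: str, src: str, tgt: str) -> str:
    """Stand-in for a machine translation model (toy lookup table)."""
    toy_mt = {
        ("¿Qué premio disfrutará tu perro?", "es", "en"):
            "What is a treat that your dog will enjoy?",
        ("dogs like bones", "en", "es"):
            "a los perros les gustan los huesos",
    }
    return toy_mt.get((text, src, tgt), text)

def retrieve(question_en: str) -> list[str]:
    """Stand-in for knowledge retrieval over English sources."""
    return ["dogs like bones"]

def translate_retrieve_translate(question: str, lang: str) -> list[str]:
    question_en = translate(question, lang, "en")        # 1. translate to English
    facts_en = retrieve(question_en)                     # 2. retrieve knowledge
    return [translate(f, "en", lang) for f in facts_en]  # 3. translate back

print(translate_retrieve_translate("¿Qué premio disfrutará tu perro?", "es"))
```

The knowledge returned in the source language is then concatenated with the original question for external attention, as in the monolingual case.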