In this post, I argue that we may live in a world where the first AGIs will look like X, and I then describe ways to make the first AGIs safer given X. I think those deserve attention, because Translucent Thoughts AIs are not safe by default. This is different from most other works in this space, which often directly describe a kind of safe AGI. Despite this, the ideas of this post are close to some other works describing paths to safe AGIs, such as:

- Externalized Reasoning Oversight, which describes a class of solutions similar to the one outlined here, but also aims for additional properties which I argue can be replaced with a less stringent hypothesis about AI systems.
- Conditioning Predictive Models, which makes assumptions slightly different from the Translucent Thoughts hypotheses, yielding different research directions.
- The Open Agency Model and Factored Cognition, which describe subsets of AIs with Translucent Thoughts, which might be safe.

Here, I sketch a world in which the first AGIs have certain properties. I argue that this world is likely, and thus a subset of all possible futures to care about. But I think it's not a large part of all possible futures (20%, conditioning on AGI before 2030).

The First AGIs Will Have LLMs at Their Core

By "first AGIs" I mean the first systems able to automate all cognitive tasks. AGI is likely to do reasoning and planning using LLMs. It might rely on vision models for some tasks and interactions with the world, and it might use explicit search processes like AlphaGo. But I expect LLMs to do plan generation and evaluation, which are the core of the system (from an alignment point of view). Why: vision systems are bad at coming up with and evaluating deceptive plans; explicit search processes can't generate and evaluate plans in the real world; LLMs seem to be able to do both plan generation and evaluation.
Text pretraining and slight fine-tuning make models able to use text generation to increase the maximum number of serial steps by a huge factor. Getting this increase through other means is likely to be hard and not competitive. If these hypotheses are true, it should lead us to prioritize underexplored research directions, such as circumventing steganography or building extremely reliable text-supervision methods.
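The serial-steps claim above can be illustrated with a toy model (purely illustrative, not a real network): a forward pass with fixed serial depth D can apply at most D dependent operations, but by externalizing the intermediate result as text and re-reading it, n passes yield n × D serial steps. The names and the increment stand-in here are my own, not from the original post.

```python
# Toy illustration: a "forward pass" with fixed serial depth D can apply
# at most D dependent operations to its input per call.
D = 3  # serial depth of one pass

def forward_pass(x: int, steps_remaining: int) -> tuple[int, int]:
    """Apply up to D serial steps; return (new value, steps still needed)."""
    done = min(D, steps_remaining)
    for _ in range(done):
        x = x + 1  # stand-in for one serially-dependent computation step
    return x, steps_remaining - done

def solve_with_scratchpad(x: int, total_steps: int) -> tuple[int, int]:
    """Externalize the intermediate value as 'text' between passes."""
    passes = 0
    while total_steps > 0:
        # each pass re-reads the scratchpad and continues where it left off
        x, total_steps = forward_pass(x, total_steps)
        passes += 1
    return x, passes

result, passes = solve_with_scratchpad(0, 10)
# 10 serial steps with depth-3 passes take ceil(10 / 3) = 4 passes
```

Without the scratchpad, a single pass tops out at D serial steps no matter how long the problem is; with it, the serial budget scales with the number of generated tokens.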
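One shape the anti-steganography direction could take is paraphrasing: reword each reasoning step before it is passed onward, so any signal encoded in exact surface form is destroyed while the overt meaning survives. The sketch below is a minimal stand-in; a real defense would use a separate trusted model as the paraphraser, and the function names here are hypothetical, not from the original post.

```python
import re

def paraphrase(step: str) -> str:
    """Toy stand-in for a paraphrasing model: normalize casing, punctuation,
    and whitespace so signals hidden in exact surface form are destroyed.
    A real defense would have a trusted model reword the step."""
    step = step.lower().strip()
    step = re.sub(r"[^\w\s]", "", step)   # drop punctuation-based channels
    step = re.sub(r"\s+", " ", step)      # collapse whitespace-based channels
    return step

def sanitize_chain_of_thought(steps: list[str]) -> list[str]:
    """Pass every reasoning step through the paraphraser, so downstream
    consumers only ever see the overt content."""
    return [paraphrase(s) for s in steps]

# Two chains that differ only in surface form (a possible covert channel)
# collapse to the same sanitized text:
a = sanitize_chain_of_thought(["First,  add 2 and 2.", "The answer is 4."])
b = sanitize_chain_of_thought(["first add 2 and 2", "the ANSWER is 4!"])
```

The point of the toy is the pipeline structure, not the normalization itself: whatever the paraphraser is, inserting it between generation steps limits covert channels to information that survives rewording.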