12 days of OpenAI: The Ars Technica recap

Day 9: Tuesday, December 17

On day 9, OpenAI released its o1 model through its API platform, adding support for function calling, developer messages, and vision processing capabilities. The company also reduced GPT-4o audio pricing by 60 percent and introduced a GPT-4o mini option that costs one-tenth of previous audio rates.
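As a rough illustration of what the new function-calling and developer-message support looks like in practice, here is a minimal sketch of a chat-completions payload for o1. The `get_weather` tool and its schema are invented for illustration; only the overall payload structure follows OpenAI's documented format.

```python
# Hypothetical sketch of a function-calling request payload for the o1
# model. The get_weather tool is invented for illustration; only the
# payload structure follows OpenAI's documented chat-completions format.

def build_o1_request(user_message: str) -> dict:
    """Build a chat-completions payload that exposes one callable tool."""
    return {
        "model": "o1",
        "messages": [
            # o1 models accept "developer" messages in place of "system"
            {"role": "developer", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical function
                    "description": "Get the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }

payload = build_o1_request("What's the weather in Boston?")
print(payload["model"], len(payload["tools"]))
```

In a real call, this dictionary would be sent to the chat completions endpoint, and the model could respond with a structured request to invoke `get_weather` rather than a plain text answer.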

OpenAI also simplified its WebRTC integration for real-time applications and unveiled Preference Fine-Tuning, which provides developers new ways to customize models. The company also launched beta versions of software development kits for the Go and Java programming languages, expanding its toolkit for developers.

Day 10: Wednesday, December 18

On Wednesday, OpenAI did something a little fun and launched voice and messaging access to ChatGPT through a toll-free number (1-800-CHATGPT), as well as WhatsApp. US residents can make phone calls with a 15-minute monthly limit, while global users can message ChatGPT through WhatsApp at the same number.

OpenAI said the release is a way to reach users who lack consistent high-speed Internet access or want to try AI through familiar communication channels, but it’s also just a clever hack. As evidence, OpenAI notes that these new interfaces serve as experimental access points, with more “limited functionality” than the full ChatGPT service, and still recommends existing users continue using their regular ChatGPT accounts for complete features.

Day 11: Thursday, December 19

On Thursday, OpenAI expanded ChatGPT’s desktop app integration to include additional coding environments and productivity software. The update added support for Jetbrains IDEs like PyCharm and IntelliJ IDEA, VS Code variants including Cursor and VSCodium, and text editors such as BBEdit and TextMate.

OpenAI also included integration with Apple Notes, Notion, and Quip while adding Advanced Voice Mode compatibility when working with desktop applications. These features require manual activation for each app and remain available to paid subscribers, including Plus, Pro, Team, Enterprise, and Education users, with Enterprise and Education customers needing administrator approval to enable the functionality.

https://arstechnica.com/information-technology/2024/12/12-days-of-openai-the-ars-technica-recap/




OpenAI announces o3 and o3-mini, its next simulated reasoning models

On Friday, during Day 12 of its “12 days of OpenAI,” OpenAI CEO Sam Altman announced its latest AI “reasoning” models, o3 and o3-mini, which build upon the o1 models launched earlier this year. The company is not releasing them yet but will make these models available for public safety testing and research access today.

The models use what OpenAI calls “private chain of thought,” where the model pauses to examine its internal dialog and plan ahead before responding, which you might call “simulated reasoning” (SR)—a form of AI that goes beyond basic large language models (LLMs).

The company named the model family “o3” instead of “o2” to avoid potential trademark conflicts with British telecom provider O2, according to The Information. During Friday’s livestream, Altman acknowledged his company’s naming foibles, saying, “In the grand tradition of OpenAI being really, truly bad at names, it’ll be called o3.”

According to OpenAI, the o3 model earned a record-breaking score on the ARC-AGI benchmark, a visual reasoning benchmark that has gone unbeaten since its creation in 2019. In low-compute scenarios, o3 scored 75.7 percent, while in high-compute testing, it reached 87.5 percent—comparable to human performance at an 85 percent threshold.

OpenAI also reported that o3 scored 96.7 percent on the 2024 American Invitational Mathematics Exam, missing just one question. The model also reached 87.7 percent on GPQA Diamond, which contains graduate-level biology, physics, and chemistry questions. On the FrontierMath benchmark by Epoch AI, o3 solved 25.2 percent of problems, while no other model has exceeded 2 percent.

https://arstechnica.com/information-technology/2024/12/openai-announces-o3-and-o3-mini-its-next-simulated-reasoning-models/




Google will apparently offer “AI Mode” right on its main search page

Google will soon take more steps to make AI a part of search, exposing more users to its Gemini agent, according to recent reports and app teardowns.

“AI Mode,” shown at the top left of the web results page and inside the Google app, will provide an interface similar to a Gemini AI chat, according to The Information.

This tracks with a finding from Android Authority earlier this month, which noted a dedicated “AI mode” button inside an early beta of the Google app. This shortcut also appeared on Google’s Android search widget, and a conversation history button was added to the Google app. Going even deeper into the app, 9to5Google found references to “aim” (AI mode) and “ai_mode,” which suggest a dedicated tab in the Google app, with buttons for speaking to an AI or sending it pictures.

Google already promotes Gemini with links below its search homepage. (“5 ways Gemini can help during the Holidays” is currently showing for me.) Search results on Google can also contain an “AI Overview,” which launched with some “use glue for pizza sauce” notoriety. People averse to AI answers can avoid them with URL parameters and proxy sites (or sticking to the “web” tab). Gemini has also been prominently added to other Google products, like Pixel phones, Gmail, and Drive/Workspace. And the search giant has also been testing the ability to attach files to a web search for analysis.

https://arstechnica.com/gadgets/2024/12/google-will-apparently-offer-ai-mode-right-on-its-main-search-page/




The AI war between Google and OpenAI has never been more heated

Over the past month, we’ve seen a rapid cadence of notable AI-related announcements and releases from both Google and OpenAI, and it’s been making the AI community’s head spin. It has also poured fuel on the fire of the OpenAI-Google rivalry, an accelerating game of one-upmanship taking place unusually close to the Christmas holiday.

“How are people surviving with the firehose of AI updates that are coming out,” wrote one user on X last Friday, which is still a hotbed of AI-related conversation. “in the last <24 hours we got gemini flash 2.0 and chatGPT with screenshare, deep research, pika 2, sora, chatGPT projects, anthropic clio, wtf it never ends.”

Rumors travel quickly in the AI world, and people in the AI industry had been expecting OpenAI to ship some major products in December. Once OpenAI announced “12 days of OpenAI” earlier this month, Google jumped into gear and seemingly decided to try to one-up its rival on several counts. So far, the strategy appears to be working, but it’s coming at the cost of the rest of the world being able to absorb the implications of the new releases.

“12 Days of OpenAI has turned into like 50 new @GoogleAI releases,” wrote another X user on Monday. “This past week, OpenAI & Google have been releasing at the speed of a new born startup,” wrote a third X user on Tuesday. “Even their own users can’t keep up. Crazy time we’re living in.”

“Somebody told Google that they could just do things,” wrote a16z partner and AI influencer Justine Moore on X, referring to a common motivational meme telling people they “can just do stuff.”

The Google AI rush

OpenAI’s “12 Days of OpenAI” campaign has included releases of its full o1 model, an upgrade from o1-preview, alongside o1-pro for advanced “reasoning” tasks. The company also publicly launched Sora for video generation, added Projects functionality to ChatGPT, introduced Advanced Voice features with video streaming capabilities, and more.

https://arstechnica.com/information-technology/2024/12/google-and-openai-blitz-december-with-so-many-ai-releases-its-hard-to-keep-up/




Why AI language models choke on too much text

This means that the total computing power required for attention grows quadratically with the total number of tokens. Suppose a 10-token prompt requires 414,720 attention operations. Then:

  • Processing a 100-token prompt will require 45.6 million attention operations.
  • Processing a 1,000-token prompt will require 4.6 billion attention operations.
  • Processing a 10,000-token prompt will require 460 billion attention operations.

This is probably why Google charges twice as much, per token, for Gemini 1.5 Pro once the context gets longer than 128,000 tokens. Generating token number 128,001 requires comparisons with all 128,000 previous tokens, making it significantly more expensive than producing the first or 10th or 100th token.
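The figures above can be reproduced with a few lines of Python. The per-pair constant below is inferred from the article's own 10-token number (414,720 operations over 10 × 9 token pairs), not from any published model spec:

```python
# Back-of-the-envelope check of the quadratic growth described above.
# The constant 4,608 operations per token pair is inferred from the
# article's 10-token figure (414,720 / (10 * 9)); it is not a model spec.

OPS_PER_PAIR = 414_720 // (10 * 9)  # 4,608

def attention_ops(n_tokens: int) -> int:
    """Every token is compared with every other token: ~n*(n-1) pairs."""
    return n_tokens * (n_tokens - 1) * OPS_PER_PAIR

for n in (10, 100, 1_000, 10_000):
    print(f"{n:>6} tokens -> {attention_ops(n):,} operations")
```

Multiplying the context length by 10 multiplies the attention work by roughly 100, which is exactly the growth the bullet list shows.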

A lot of effort has been put into optimizing attention. One line of research has tried to squeeze maximum efficiency out of individual GPUs.

As we saw earlier, a modern GPU contains thousands of execution units. Before a GPU can start doing math, it must move data from slow shared memory (called high-bandwidth memory) to much faster memory inside a particular execution unit (called SRAM). Sometimes GPUs spend more time moving data around than performing calculations.

In a series of papers, Princeton computer scientist Tri Dao and several collaborators have developed FlashAttention, which calculates attention in a way that minimizes the number of these slow memory operations. Work like Dao’s has dramatically improved the performance of transformers on modern GPUs.
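To make the idea concrete, here is a toy NumPy sketch of the blockwise "online softmax" trick at the heart of FlashAttention: attention is computed one block of keys at a time, with a running max and running sum standing in for the full score matrix. This is a numerical illustration only, not a reproduction of Dao's GPU kernel; all sizes are arbitrary.

```python
import numpy as np

# Toy NumPy sketch of the "online softmax" idea behind FlashAttention:
# process the keys one block at a time, keeping only a running max and a
# running sum per query, so the full n-by-n score matrix never needs to
# sit in slow memory at once. Illustration only, not Dao's GPU kernel.

rng = np.random.default_rng(0)
n, d, block = 16, 8, 4
Q, K, V = rng.normal(size=(3, n, d))

# Reference: materialize every score, then softmax in one shot.
scores = Q @ K.T / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
ref = (weights / weights.sum(axis=1, keepdims=True)) @ V

running_max = np.full(n, -np.inf)   # max score seen so far, per query
denom = np.zeros(n)                 # running softmax denominator
out = np.zeros((n, d))              # running weighted sum of values
for start in range(0, n, block):
    s = Q @ K[start:start + block].T / np.sqrt(d)
    new_max = np.maximum(running_max, s.max(axis=1))
    rescale = np.exp(running_max - new_max)   # correct earlier partials
    p = np.exp(s - new_max[:, None])          # this block's raw weights
    denom = denom * rescale + p.sum(axis=1)
    out = out * rescale[:, None] + p @ V[start:start + block]
    running_max = new_max
out /= denom[:, None]

print(np.allclose(out, ref))  # blockwise result matches the naive one
```

The real kernel applies the same rescaling arithmetic, but its payoff comes from keeping each block's partial results in fast on-chip SRAM instead of round-tripping through high-bandwidth memory.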

Another line of research has focused on efficiently scaling attention across multiple GPUs. One widely cited paper describes ring attention, which divides input tokens into blocks and assigns each block to a different GPU. It’s called ring attention because GPUs are organized into a conceptual ring, with each GPU passing data to its neighbor.

I once attended a ballroom dancing class where couples stood in a ring around the edge of the room. After each dance, women would stay where they were while men would rotate to the next woman. Over time, every man got a chance to dance with every woman. Ring attention works on the same principle. The “women” are query vectors (describing what each token is “looking for”) and the “men” are key vectors (describing the characteristics each token has). As the key vectors rotate through a sequence of GPUs, they get multiplied by every query vector in turn.
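The dance-class rotation can be simulated directly. In this toy NumPy sketch, each "device" keeps its block of query vectors while key blocks are passed around the ring; after one full revolution, the assembled scores match a single all-at-once computation. Device counts and vector sizes are arbitrary, and no real GPUs are involved:

```python
import numpy as np

# Toy simulation of ring attention's rotation scheme (no real GPUs):
# each "device" keeps its block of query vectors while key blocks are
# passed around the ring, so every query eventually meets every key.

rng = np.random.default_rng(1)
n_devices, block, d = 4, 3, 5
Q = rng.normal(size=(n_devices, block, d))   # query blocks, one per device
K = rng.normal(size=(n_devices, block, d))   # key blocks, one per device

scores = [np.zeros((block, n_devices * block)) for _ in range(n_devices)]
key_blocks = list(K)                          # key block resident on each device
for step in range(n_devices):
    for dev in range(n_devices):
        owner = (dev + step) % n_devices      # which key block is visiting
        cols = slice(owner * block, (owner + 1) * block)
        scores[dev][:, cols] = Q[dev] @ key_blocks[dev].T
    # pass each key block to the neighboring device in the ring
    key_blocks = key_blocks[1:] + key_blocks[:1]

full = np.concatenate(list(Q)) @ np.concatenate(list(K)).T
print(np.allclose(np.concatenate(scores), full))  # ring result matches
```

After `n_devices` rotation steps, every query block has been multiplied against every key block, just as every couple in the dance class eventually pairs up.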

https://arstechnica.com/ai/2024/12/why-ai-language-models-choke-on-too-much-text/




Garante Privacy hits OpenAI with a €15 million fine and an information campaign for ChatGPT

The violations alleged against OpenAI and the Garante Privacy’s sanctions

The Garante per la protezione dei dati personali (Italy’s data protection authority) has imposed a €15 million fine on OpenAI, the company behind the generative AI chatbot ChatGPT. The decision comes at the end of an investigation opened in March 2023 and follows the opinion of the EDPB (European Data Protection Board), which laid out a common approach to questions concerning the processing of personal data in the context of AI.

The Garante found several violations by OpenAI:

  • failure to notify a data breach: the company did not notify the Authority of the breach it suffered in March 2023;
  • unlawful processing of data for AI training: OpenAI used users’ personal data to train ChatGPT without an adequate legal basis, violating the principle of transparency and its information obligations toward users;
  • absence of age-verification mechanisms: the lack of age checks exposes children under 13 to potentially inappropriate content.

As EPDB chair Talus put it: “AI technologies can offer many opportunities and benefits across different industries and areas of life. We must ensure these innovations are carried out ethically, safely, and in a way that benefits everyone. The EDPB intends to support responsible AI innovation by ensuring that personal data is protected, in full compliance with the General Data Protection Regulation (GDPR).”

The measures imposed

To ensure transparency in the processing of personal data, the Garante ordered OpenAI, invoking for the first time the new powers under Article 166(7) of Italy’s Privacy Code, to run a six-month institutional communication campaign across radio, television, newspapers, and the Internet.

The campaign, whose content must be agreed upon with the Authority, will aim to:

  • inform the public about how ChatGPT works, with particular attention to the collection of user and non-user data for training generative AI;
  • explain the rights of data subjects, including the rights to object to, rectify, and erase their data;
  • make ChatGPT users and non-users aware of how to object to the use of their personal data for AI training, putting them in a position to exercise their rights under the GDPR.

The fine and the role of the Irish authority

In addition to the information campaign, the Garante fined OpenAI €15 million, a figure that takes into account the company’s cooperative attitude during the investigation.

Because OpenAI has established its European headquarters in Ireland, the Garante, in line with the GDPR’s “one-stop-shop” principle, forwarded the case files to the Irish Data Protection Commission (DPC), which becomes the lead supervisory authority. The DPC will continue the investigation into any ongoing violations that had not ceased before the European office opened.

Not just ChatGPT: an important precedent for generative AI in general

The Garante Privacy’s decision sets an important precedent for the protection of personal data in the field of generative artificial intelligence.

The focus is on transparency, informed consent, and the protection of minors, with the goal of ensuring responsible and informed use of these new technologies. The cooperation between European authorities, in particular between the Italian Garante and the Irish DPC, underscores the importance of a coordinated European approach to the challenges posed by AI.

OpenAI’s position

OpenAI’s response was swift. An official statement reads: “The Garante’s decision is not proportionate and we will appeal. When the Garante ordered us to suspend ChatGPT in Italy in 2023, we worked with the Authority to make it available again a month later.”

“Even then, the Garante had recognized our leading role on data protection in AI,” the statement continues, “and this fine is roughly twenty times the revenue we generated in Italy over the same period. We believe the Garante’s approach undermines Italy’s AI ambitions, but we remain committed to working with privacy authorities around the world to offer AI that benefits society while respecting privacy rights.”

The statement specifies that the investigation covers the period from November 2022 to March 2023. Since ChatGPT’s release in November 2022, OpenAI says it has made it even easier for users to access its data tools by building them into ChatGPT’s settings.

The American company also notes that it has launched a Privacy Center, at privacy.openai.com, where users can set their privacy preferences and opt out of having their data used for AI training.


https://www.key4biz.it/stangata-del-garante-privacy-a-openai-multa-da-15-milioni-di-euro-e-campagna-informativa-per-chatgpt/516406/




Not to be outdone by OpenAI, Google releases its own “reasoning” AI model

Google DeepMind’s chief scientist, Jeff Dean, says that the model receives extra computing power, writing on X, “we see promising results when we increase inference time computation!” The model works by pausing to consider multiple related prompts before providing what it determines to be the most accurate answer.

Since OpenAI’s jump into the “reasoning” field in September with o1-preview and o1-mini, several companies have been rushing to achieve feature parity with their own models. For example, DeepSeek launched DeepSeek-R1 in early November, while Alibaba’s Qwen team released its own “reasoning” model, QwQ, earlier this month.

While some claim that reasoning models can help solve complex mathematical or academic problems, these models might not be for everybody. While they perform well on some benchmarks, questions remain about their actual usefulness and accuracy. Also, the high computing costs needed to run reasoning models have created some rumblings about their long-term viability. That high cost is why OpenAI’s ChatGPT Pro costs $200 a month, for example.

Still, it appears Google is serious about pursuing this particular AI technique. Logan Kilpatrick, a Google employee in its AI Studio, called it “the first step in our reasoning journey” in a post on X.

https://arstechnica.com/information-technology/2024/12/not-to-be-outdone-by-openai-google-releases-its-own-reasoning-ai-model/




New physics sim trains robots 430,000 times faster than reality

The AI-generated worlds reportedly include realistic physics, camera movements, and object behaviors, all from text commands. The system then creates physically accurate ray-traced videos and data that robots can use for training.

Examples of “4D dynamical and physical” worlds that Genesis created from text prompts.

This prompt-based system lets researchers create complex robot testing environments by typing natural language commands instead of programming them by hand. “Traditionally, simulators require a huge amount of manual effort from artists: 3D assets, textures, scene layouts, etc. But every component in the workflow can be automated,” wrote Fan.

Using its engine, Genesis can also generate character motion, interactive 3D scenes, facial animation, and more, which may allow for the creation of artistic assets for creative projects. It may also lead to more realistic AI-generated games and videos in the future, since it constructs a simulated world in data instead of operating on the statistical appearance of pixels, as a video synthesis diffusion model does.

Examples of character motion generation from Genesis, using a prompt that includes, “A miniature Wukong holding a stick in his hand sprints across a table surface for 3 seconds, then jumps into the air, and swings his right arm downward during landing.”

While the generative system isn’t yet part of the currently available code on GitHub, the team plans to release it in the future.

Training tomorrow’s robots today (using Python)

Genesis remains under active development on GitHub, where the team accepts community contributions.

The platform stands out from other 3D world simulators for robotic training by using Python for both its user interface and core physics engine. Other engines use C++ or CUDA for their underlying calculations while wrapping them in Python APIs. Genesis takes a Python-first approach.

Notably, the non-proprietary nature of the Genesis platform makes high-speed robot training simulations available to any researcher for free through simple Python commands that work on regular computers with off-the-shelf hardware.

Previously, running robot simulations required complex programming and specialized hardware, says Fan in his post announcing Genesis, and that shouldn’t be the case. “Robotics should be a moonshot initiative owned by all of humanity,” he wrote.

https://arstechnica.com/information-technology/2024/12/new-physics-sim-trains-robots-430000-times-faster-than-reality/




AI and the future of online search: Perplexity triples its valuation to $9 billion

A record valuation and significant investment for Perplexity AI

Transforming the world of online search through artificial intelligence is the goal of Perplexity AI, a startup founded in 2022 by a team of former OpenAI and Meta researchers, including Aravind Srinivas, Andy Konwinski, Denis Yarats, and Johnny Ho.

Its product stands out for its strong ability to understand natural language and the context of queries.

Perplexity AI has also closed a major $500 million funding round. The investment, led by Institutional Venture Partners, tripled the company’s valuation, which now stands at $9 billion, up from $3 billion in June 2024.

That dizzying growth, nine times the company’s valuation at the start of the year, reflects the market’s strong interest in the most innovative artificial intelligence (AI) technologies and their potential impact on the search industry.

An innovative approach to search

What sets Perplexity apart is its focus on real-time search. Unlike chatbots that rely solely on pre-trained data, the platform uses an AI model that actively scans the web to deliver up-to-date answers in a conversational, ChatGPT-like format.

The company offers a free tier and a premium tier, with additional features such as internal search for organizations and dedicated finance tools. As of March 2024, Perplexity had more than 15 million active users and hundreds of millions of monthly queries.

Strategic acquisitions

Perplexity recently acquired Carbon, a startup specializing in retrieval-augmented generation (RAG) technology.

The acquisition aims to improve search capabilities in business and productivity applications, strengthening the company’s position in serving enterprise users.

Market challenges and Perplexity AI’s growth strategy

In the competitive landscape, Perplexity faces giants such as OpenAI, which has integrated web search into ChatGPT, along with Google and Microsoft, which are embedding advanced language models into their search engines.

Perplexity, for its part, is exploring monetization through advertising and partnerships with major publishers such as Time and Fortune, to share revenue and address content-related challenges.

Investors bet on artificial intelligence

The massive investment in Perplexity reflects a broader trend: in 2024, 42 percent of US venture capital went to AI startups, up from 36 percent in 2023.

That confidence underscores AI’s transformative potential, particularly in online search and enterprise solutions.

With a record valuation, substantial funding, and strategic technology acquisitions, Perplexity AI seems well positioned to challenge the industry’s giants and shape the future of online search. As it continues to innovate, it remains to be seen how the company will navigate an increasingly crowded competitive landscape and capitalize on its technological potential.

The future of AI-powered search engines

The adoption of AI-powered search engines such as Perplexity and ChatGPT is gaining momentum. These tools offer more direct, contextualized answers than traditional link-based search results, improving the user experience.

According to Bloomberg, such tools could significantly reduce traffic to Google, which currently holds more than 90 percent of the online search market.

There is growing dissatisfaction with the quality of Google’s results, which has led to greater openness toward new technologies that promise a better search experience.

AI solutions can answer specific questions with more relevant, reliable information, helping meet users’ needs. The competition could push Google to improve the quality of its services, making online search more efficient.

The global market for AI-powered online search systems is expected to grow significantly in the coming years; according to the most recent estimates, it could reach $12.9 billion by 2031.


https://www.key4biz.it/ai-e-futuro-della-ricerca-online-perplexity-triplica-il-suo-valore-e-sale-a-9-miliardi-di-dollari/516260/




Call ChatGPT from any phone with OpenAI’s new 1-800 voice service

On Wednesday, OpenAI launched a 1-800-CHATGPT (1-800-242-8478) telephone number that anyone in the US can call to talk to ChatGPT via voice chat for up to 15 minutes for free. The company also says that people outside the US can send text messages to the same number for free using WhatsApp.

Upon calling, users hear a voice say, “Hello again, it’s ChatGPT, an AI assistant. Our conversation may be reviewed for safety. How can I help you?” Callers can ask ChatGPT anything they would normally ask the AI assistant and have a live, interactive conversation.

During a livestream demo of “Calling with ChatGPT” during Day 10 of “12 Days of OpenAI,” OpenAI employees demonstrated several examples of the telephone-based voice chat in action, asking ChatGPT to identify a distinctive house in California and for help in translating a message into Spanish for a friend. For fun, they showed calls from an iPhone, a flip phone, and a vintage rotary phone.

OpenAI developers demonstrate calling 1-800-CHATGPT during a livestream on December 18, 2024. Credit: OpenAI

OpenAI says the new features came out of an internal OpenAI “hack week” project that a team built just a few weeks ago. The company says its goal is to make ChatGPT more accessible if someone does not have a smartphone or a computer handy.

During the livestream, an OpenAI employee mentioned that 15 minutes of voice chatting are free and that you can download the app and create an account to get more. While the audio chat version seems to be running a full version of GPT-4o on the back end, a developer during the livestream said the free WhatsApp text mode is using GPT-4o mini.

https://arstechnica.com/information-technology/2024/12/openai-launches-free-phone-hotline-to-let-anyone-call-chatgpt/