Business leaders agree AI is the future. They just wish it worked right now.

2025-12-16
14 min read
2709 words


🎧 Voice Briefing

📅 Generated: 2025. 12. 16. 오후 10:24:57
⏱️ Duration: ~69s

Tour of OpenAI data center in Abilene, Texas

OpenAI CEO Sam Altman speaks to media following a Q&A at the OpenAI data center in Abilene, Texas, U.S., September 23, 2025. REUTERS/Shelby Tauber/Pool/File Photo Purchase Licensing Rights, opens new tab

SAN FRANCISCO/STOCKHOLM, Dec 16 - Last spring, CellarTracker, a wine-collection app, built an AI-powered sommelier to make unvarnished wine recommendations based on a person’s palate. The problem was the chatbot was too nice.

“It's just very polite, instead of just saying, ‘It's really unlikely you'll like the wine,’” CellarTracker CEO Eric LeVine said. It took six weeks of trial and error to coax the chatbot into offering an honest appraisal before the feature was launched.

Since ChatGPT exploded three years ago, companies big and small have leapt at the chance to adopt generative artificial intelligence and stuff it into as many products as possible. But so far, the vast majority of businesses are struggling to realize a meaningful return on their AI investments, according to company executives, advisors and the results of seven recent executive and worker surveys.

One survey of 1,576 executives conducted during the second quarter by research and advisory firm Forrester Research showed just 15% of respondents saw profit margins improve due to AI over the last year. Consulting firm BCG found that only 5% of 1,250 executives surveyed between May and mid-July saw widespread value from AI.

Executives say they still believe generative AI will eventually transform their businesses, but they are reconsidering how quickly that will happen within their organizations. Forrester predicts that in 2026 companies will delay about 25% of their planned AI spending by a year.

“The tech companies who have built this technology have spun this tale that this is all going to change quickly,” Forrester analyst Brian Hopkins said. “But we humans don’t change that fast.”

AI companies including OpenAI, Anthropic and Google are all doubling down on courting business customers in the next year. During a recent lunch with media editors in New York, OpenAI CEO Sam Altman said developing AI systems for companies could be a $100 billion market.

All this is happening against the backdrop of unprecedented tech investment in everything from chips, to data centers, to energy sources.

Whether these investments can be justified will be determined by companies’ ability to figure out how to use AI to boost revenue, fatten margins or speed innovation. Failing that, the infrastructure build-out could trigger the kind of crash reminiscent of the dot-com bust in the early 2000s, some experts say.

THE 'EASY' BUTTON

Soon after ChatGPT’s launch, companies worldwide created task forces dedicated to finding ways to embrace generative AI, a type of AI that can create original content like essays, software code and images through text prompts.

One well-known issue with AI models is their tendency to please the user. This bias – what’s called “sycophancy” – encourages users to chat more, but can impair the model’s ability to give better advice.

CellarTracker ran into this problem with its wine-recommendation feature, built on top of OpenAI’s technology, CEO LeVine said. The chatbot performed well enough when asked for general recommendations. But when asked about specific vintages, the chatbot remained positive – even if all signals showed a person was highly unlikely to enjoy them.

“We had to bend over backwards to get the models (any model) to be critical and suggest there are wines I might not like,” LeVine said.

Part of the solution was designing prompts that gave the model permission to say no.

Companies have also struggled with AI’s lack of consistency.

Jeremy Nielsen, general manager at North American railroad service provider Cando Rail and Terminals, said the company recently tested an AI chatbot for employees to study internal safety reports and training materials.

But Cando ran into a surprising stumbling block: the models couldn’t consistently and correctly summarize the Canadian Rail Operating Rules, a roughly 100-page document that lays out the safety standards for the industry.

Sometimes the models forgot or misinterpreted the rules; other times they invented them from whole cloth. AI researchers say models often struggle to recall what appears in the middle of a long document.

Cando has dropped the project for now, but is testing other ideas. So far the company has spent $300,000 on developing AI products.

“We all thought it’d be the easy button,” Nielsen said. “And that’s just not what happened.”

HUMANS MAKE A COMEBACK

Human-staffed call centers and customer service were supposed to be heavily disrupted by AI, but companies quickly learned there are limits to the amount of human interaction that can be delegated to chatbots.

In early 2024, Swedish payments company Klarna rolled out an OpenAI-powered customer service agent that it said could do the work of 700 full-time customer service agents.

In 2025, however, CEO Sebastian Siemiathowski was forced to dial that back and acknowledge that some customers preferred to talk with humans.

Siemiathowski said AI is reliable on simple tasks and can now do the work of about 850 agents, but more complex issues quickly get referred to human agents.

For 2026, Klarna is focused on building its second-generation AI chatbot, which it hopes to ship soon, but human beings will remain a big part of the mix.

“If you want to stay customer-obsessed, you can't rely [entirely] on AI,” he said.

Similarly, U.S. telecommunications giant Verizon is leaning back into human customer service agents in 2026 after attempts to delegate calls to AI.

“I think 40% of consumers like the idea of still talking to a human, and they're frustrated that they can't get to a human agent,” said Ivan Berg, who leads Verizon’s AI-driven efforts to enhance service operations for business customers, in a Reuters interview this fall.

The company, which has about 2,000 frontline customer service agents, still uses AI to screen calls, get information on customers, and direct them to either self-service systems or to human agents.

Using AI to handle routine questions frees up agents to handle complex issues and try new things, such as making outbound calls and doing sales.

“Empathy is probably the key thing that's holding us from having AI agents talk to customers holistically right now,” Berg said.

Shashi Upadhyay, president of product, engineering and AI at customer-service platform Zendesk, says AI excels in three areas: writing, coding and chatting. Zendesk’s clients rely on generative AI to handle between 50% and 80% of their customer-support requests. But, he said, the idea that generative AI can do everything is “oversold.”

THE ‘JAGGED FRONTIER’

Large language models are rapidly conquering complex tasks in math and coding, but can still fail at comparatively trivial tasks. Researchers call this contradiction in capabilities the “jagged frontier” of AI.

“It might be a Ferrari in math but a donkey at putting things in your calendar,” said Anastasios Angelopoulos, the CEO and cofounder of LMArena, a popular benchmarking tool.

Seemingly small issues can unexpectedly trip up AI systems.

Many financial firms rely on data compiled from a broad range of sources, all of which can be formatted very differently. These differences might prompt an AI tool to “read patterns that don’t exist,” said Clark Shafer, director at advisory firm Alpha Financial Markets Consulting.

Many companies are now looking into the potentially expensive, lengthy and complex process of reformatting their data to take advantage of AI, Shafer said.

Dutch technology investment group Prosus says one of its in-house AI agents is meant to answer questions about its portfolio, similar to what the group’s data analysts on staff already do.

Theoretically, an employee could ask how often a Prosus-backed food-delivery firm was late to deliver sushi orders in Berlin last week.

But for now, the tool doesn’t always understand what neighborhoods are part of Berlin or what “last week” means, said Euro Beinat, head of AI for Prosus.

“People thought AI was magic. It's not magic,” Beinat said. “There's a lot of knowledge that needs to be encoded in these tools to work well.”

MORE HANDHOLDING

OpenAI is working on a new product for businesses and recently created internal teams, such as the Forward Deployed Engineering team, to work directly with clients to help them use OpenAI’s technology to tackle specific problems, a spokesperson said.

“Where we do see failure is people that jump in too big, they find that billion-dollar problem—that's going to take a few years,” said Ashley Kramer, OpenAI’s head of revenue, during an onstage interview at Reuters Momentum AI conference in November.

Specifically, OpenAI is working with companies to find areas where AI can have a “high impact but maybe low lift at first,” said Kramer.

Rival AI lab Anthropic, which draws 80% of its revenue from business customers, is hiring “applied AI” experts who will embed with companies.

For AI companies to succeed, they will have to view themselves as “partners and educators, rather than just deployers of technology,” said Mike Krieger, Anthropic’s head of product, in an interview earlier this year.

An increasing number of startups, many founded by former OpenAI employees, are developing AI tools for specific sectors such as financial services or legal. These founders say companies will benefit from specialized models more than general-purpose or consumer tools like ChatGPT.

It’s a playbook that Writer, a San Francisco–based AI application startup, has been adopting. The company, which is now building AI agents for finance and marketing teams at large firms such as Vanguard and Prudential, puts its engineers on calls directly with clients to understand their workflows and co-build the agents.

“Companies need more handholding in actually making AI tools useful for them,” said May Habib, CEO of Writer.

Reporting by Deepa Seetharaman and Krystal Hu in San Francisco and Supantha Mukherjee in Stockholm. Editing by Kenneth Li and Michael Learmonth.

🧠 Connected Insights

📅 Last analyzed: 2025. 12. 20. 오전 7:39:50
💰 Analysis cost: $0.0178

🔗 Related Notes

  • The Number of People Using

    • supports: 분석 노트가 비즈니스 리더들의 AI ROI 부족과 채택 지연을 강조하는 반면, 이 노트는 직장 내 AI 사용자가 감소하고 있음을 보여주어 AI 실용성 문제에 대한 공통된 회의적 시각을 지지함. 개념적 연관성: AI 비즈니스 채택의 실망스러운 현실.
    • Confidence: ████░ (85%)
  • 🔗 The fastest-growing AI chatbot now

    • related: 분석 노트에서 OpenAI, Anthropic, Google의 비즈니스 공략을 언급하나, 이 노트는 이들 외의 챗봇(Gemini, Grok 등)의 빠른 성장을 다루어 AI 시장 경쟁과 대형사 중심이 아닌 다각화 추세를 연결. 논리적 관계: AI 생태계의 광범위한 발전.
    • Confidence: ████░ (75%)
  • 📝 [[300-Creator/350-NewsInsight/![Microsoft_CEO_Satya_Nadella](httpsi.extremete_뉴스인사이트.md]]

    • examples: 분석 노트의 AI 비즈니스 적용 어려움을 보완하는 사례로, Microsoft의 Copilot과 agentic AI가 구체적 예시. 사례 관계: 대형 테크 기업의 AI 제품화 노력.
    • Confidence: ████░ (70%)
  • 📝 ![Google_CEO_Sundar_Pichai,_at_the_Sun_Valley_conf_뉴스인사이트

    • examples: Google의 Gemini 3.0과 Pichai 관련 내용이 분석 노트의 Google AI 공략을 구체화. 사례 관계: AI 리더들의 비즈니스 전략 예시.
    • Confidence: ████░ (70%)
  • 🔼 NVIDIA 젠슨 황 AI 인프라 총정리 3300조원 시장.AI 인프라 혁명에 대해 이야기합니다.

    • extends: 분석 노트의 칩/데이터센터/에너지 투자와 dot-com 버블 위험을 확장하며, NVIDIA의 AI 인프라 시장(3300조원) 전망이 인프라 빌드업의 규모와 위험을 논리적으로 연결.
    • Confidence: ████░ (80%)

📚 Knowledge Gaps

  • 🔴 AI 모델의 'sycophancy' 문제와 해결 사례

    • CellarTracker 사례처럼 AI의 과도한 친절함이 실용성을 떨어뜨리는 문제를 언급하나, 기술적 해결책(예: 프롬프트 엔지니어링, 파인튜닝)이나 성공 사례가 부족. 비즈니스 ROI 향상에 핵심적임.
    • Suggested resources: OpenAI의 Sycophancy 연구 논문, Anthropic의 Constitutional AI 가이드
  • 🔴 AI ROI 달성 성공 사례와 벤치마크

    • 대부분 실패 사례만 강조되나, Forrester/BCG 설문 외 실제 ROI 개선 기업(15% 또는 5% 그룹)의 구체적 전략이 없음. 미래 투자 결정에 필수.
    • Suggested resources: McKinsey AI ROI 보고서 2025, Gartner Magic Quadrant for Enterprise AI
  • 🟡 AI 인프라 투자 버블 위험과 대안

    • dot-com 비교가 있지만, 현재 AI 버블 방지 전략(예: 에너지 효율, 엣지 컴퓨팅)이나 역사적 교훈이 미흡. 2026 지연 예측과 연계.
    • Suggested resources: 'The AI Bubble' by Sequoia Capital, IEA AI Energy Demand Forecast

💡 AI Insights

이 노트는 AI 하이프와 현실 간 괴리를 강조하며, 비즈니스 리더들의 낙관 속 실망을 포착. 관련 노트들은 AI 채택 저하, 경쟁 심화, 인프라 확대를 보완하나, 전체적으로 ROI 실현 지연과 인프라 과투자 위험이 지식 네트워크의 핵심 테마로 부상. 깊이 있는 성공 사례 보강 필요.

🧠 Connected Insights

📅 Last analyzed: 2025. 12. 21. 오후 7:31:40
💰 Analysis cost: $0.0176

🔗 Related Notes

  • 🔗 오픈AI 연구원 고교 중퇴 후

    • related: 개인적 AI 학습 성공 사례(챗GPT로 머신러닝 배움)를 제시하나, 분석 노트의 비즈니스 ROI 미달성과 대조되어 AI 접근성 vs 실무 적용성의 개념적 연관성을 보임.
    • Confidence: ████░ (70%)
  • 🔼 소프트웨어의 미래 우리는 무엇을 만들어야할까?

    • extends: 소프트웨어 미래와 AI 에이전트 논의가 분석 노트의 AI 제품화 어려움(예: sommelier 챗봇 수정)을 확장하며, 비즈니스 적용 지연을 논리적으로 지지.
    • Confidence: ████░ (75%)
  • The shard moment of transition

    • supports: AI 전환의 'shard moment' 개념이 비즈니스 리더들의 AI 미래 신뢰 vs 현재 ROI 실패를 지지하며, 변화 속도 지연을 공통 테마로 연결.
    • Confidence: ████░ (72%)
  • 📝 머스크 AI에 필요한 것은 진실·아름다움·호기심

    • examples: 머스크의 '진실' 강조가 분석 노트의 AI sycophancy 문제(너무 polite한 챗봇) 예시를 구체화하며, AI 품질 개선 필요성을 예시로 연결.
    • Confidence: ████░ (75%)
  • 🔗 Better than the cheap alternative

    • related: AI 시대 인간 전문성 차별화가 비즈니스 AI 투자 ROI 미달성과 연관되어, 저비용 AI 대안의 한계를 공유.
    • Confidence: ████░ (70%)

📚 Knowledge Gaps

  • 🔴 AI 모델의 'sycophancy' 문제와 해결 사례

    • 분석 노트에서 CellarTracker 챗봇 예시처럼 AI의 과도한 polite함이 비즈니스 신뢰를 떨어뜨리며, 구체적 해결(예: fine-tuning 기법) 부재로 ROI 저해.
    • Suggested resources: Anthropic's Constitutional AI paper, OpenAI sycophancy mitigation research
  • 🔴 AI ROI 달성 성공 사례와 벤치마크

    • 설문 결과(Forrester 15%, BCG 5%)처럼 대부분 실패하나, 성공 기업(예: 금융/의료 도메인) 사례와 측정 벤치마크가 없어 실무 가이드라인 부족.
    • Suggested resources: McKinsey AI ROI report 2025, Gartner Magic Quadrant for Enterprise AI
  • 🟡 AI 인프라 투자 버블 위험과 대안

    • 대규모 투자(칩, 데이터센터) 배경에서 Forrester의 25% 지연 예측과 연결되나, 버블 붕괴 위험 및 효율적 대안(엣지 컴퓨팅 등) 미탐구.
    • Suggested resources: NVIDIA market analysis 2025, CB Insights AI infrastructure bubble report
  • 🟡 비즈니스 리더들의 AI 기대 관리 전략

    • 리더들의 '미래 신뢰 vs 현재 좌절' 갭이 명확하나, hype 관리와 단계적 도입 프레임워크 부재로 전략적 지식 갭.
    • Suggested resources: Forrester AI Hype Cycle 2026, Seth Godin blog on tech transitions

💡 AI Insights

이 노트는 AI hype와 비즈니스 현실 간 괴리를 강조하며, sycophancy 같은 기술적 결함과 ROI 미달성이 채택 지연을 초래함을 드러냄. 관련 노트들은 AI 철학/학습 성공을 보완하나, 성공 사례와 위험 관리 갭이 지식 네트워크의 약점으로, 실무 중심 연결 강화 필요.

Back to All Posts
NEW

뉴스레터 서비스가 정식 시작되었습니다!

매주 금요일, 옵시디언으로 정리한 AI 인사이트를 메일함으로 배달해 드립니다.