Introduction – New Paths for Data-Driven Decisions
Imagine your new advertising campaign is launching tomorrow. Wouldn't you feel more at ease if you already knew today how your target audience would respond? Traditional market research would take weeks to answer that question. With neuroflash's digital twins, you get the answer almost instantly.
Market research is one of the most important foundations for successful decision-making. But traditional methods—surveys, interviews, or panels—are time-consuming, expensive, and limited to the people willing to take part. This is especially problematic in the B2B sector with its specialized target groups: few participants, high incentives, and weeks of execution time. Here, AI can help: using large language models, digital twins are created that realistically simulate real customer reactions. But what is this "Silicon Sampling" all about, how does it work, and what benefits does it offer, especially for marketing teams in small and medium-sized businesses? The following article explores these questions—clearly explained for anyone interested in data-driven decisions.
AI-Based Target Groups – What Are Digital Twins?
The use of digital twins refers to a new method in market research, where instead of real survey participants, AI models take on the role of consumers. Specifically, large language models (LLMs)—such as the latest GPT models—are conditioned to simulate people with specific demographic characteristics and behaviors (Aher, Arriaga, & Kalai, 2023). The AI thus generates digital twins that answer questions the way a real person from the target group would.
Studies show that a sufficiently trained language model can capture even fine-grained differences between subgroups (keyword: algorithmic fidelity) and respond in line with the original response distribution (Argyle et al., 2023; Motoki et al., 2023). Simply put: the AI can imitate, for example, the mindset of a 35-year-old marketing manager or a 50-year-old head of procurement with surprising nuance—including the attitudes, preferences, and language quirks typical for that persona.
These digital twins make it possible to gather opinions and reactions without having to interview a single real person. The method was first tested in scientific experiments in 2022/23 (including Aher et al., 2023; Argyle et al., 2023) and has since been continually refined. neuroflash is now bringing this idea into corporate practice: instead of conducting elaborate in-house surveys, companies can “interview” the AI and still receive valid, almost human-like responses—quickly, flexibly, and cost-efficiently.
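To make this conditioning concrete, here is a minimal sketch of how a single digital twin could be prompted. It assumes the OpenAI Python client; the model name, persona wording, and survey question are invented for illustration and do not represent neuroflash's actual pipeline.

```python
# Minimal sketch: conditioning an LLM to answer as one digital twin.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The system prompt carries the demographic and attitudinal profile
# the twin should embody (a hypothetical persona).
persona = (
    "You are a 35-year-old marketing manager at a mid-sized B2B software "
    "company. You are data-driven, budget-conscious, and skeptical of "
    "buzzwords. Answer survey questions in the first person, as this "
    "individual would."
)

question = "What would make you switch your current email marketing tool?"

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative; any capable chat model works
    messages=[
        {"role": "system", "content": persona},
        {"role": "user", "content": question},
    ],
    temperature=0.9,  # some variance, since real respondents also differ
)
print(response.choices[0].message.content)
```

The non-zero temperature is a deliberate choice: it introduces the kind of response variance you would expect across real respondents drawn from the same segment.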
Why Use Digital Twins?
AI-generated survey participants offer a number of tangible advantages over traditional market research:
- Precision & Depth of Insight: Despite all skepticism toward AI, in many cases the responses of an LLM already reflect real opinion distributions with remarkable accuracy. Several studies have found high levels of agreement with actual survey results, typically around 80% or more (Motoki et al., 2023), and research shows that LLMs trained on media diets can predict public opinion (Chu et al., 2023). In other words: the AI's "opinions" deviate only minimally from the results you would get from a human sample. The generated responses are also consistent and detailed—often mentioning similar key topics and justifications as real interviews. For marketers, this means: digital twins provide actionable insights that stand up in the real world (Brand et al., 2023).
- Speed & Flexibility: Results are available almost in real time. Instead of sending out questionnaires and waiting for responses, the AI generates comprehensive consumer feedback datasets within minutes (Ke, Zhao, & McAuley, 2024). This enables the spontaneous testing of new ideas, content, or hypotheses. Adjustments are also straightforward—questions can be reformulated at any time, target parameters changed—the virtual participants are available 24/7 without ever getting tired.
- Cost Efficiency: Traditional market research is expensive—especially when it comes to reaching specialized B2B segments. Setting up a representative expert panel or conducting decision-maker interviews often carries significant costs, sometimes well over €100,000 per study. Digital twins, in contrast, cost only a fraction of that: neuroflash, for example, offers unlimited studies with up to 5,000 digital twin respondents for €999 per month, whereas an equivalent survey with real participants would cost over €10,000 and take several weeks (Brand, Israeli, & Ngwe, 2023).
How Are Digital Twins Surveyed?
For an AI to serve as a virtual representative of real people, it must be skillfully “fed” with the relevant characteristics of the desired target group. The method takes place in three steps (Argyle et al., 2023):
- Creating Digital Twins: First, the client provides a persona description. Demographic characteristics are defined (age, gender, education, occupation, etc.), as well as industry affiliation if relevant, and known behaviors and attitudes (interests, media usage, buying behavior, etc.). The more detailed and realistic this profile is, the more convincing the simulated answers will be. (In scientific studies, real profile data was used, for example, to condition the AI with diverse background stories.) For a company, one might describe a “typical customer” like this: “42-year-old IT manager at a medium-sized industrial company, tech-savvy, reads trade magazines like X, values data security and budget efficiency.”
- Generating Responses Through AI: Next, the model is given concrete market research questions—about product ideas, advertising messages, or purchasing criteria, for example. The AI now answers as the defined persona would (Aher et al., 2023). It's important here to use high-quality prompt formulations, the right model, and a solid data foundation so that the AI truly takes on the role. Research shows that only well-configured LLMs deliver highly plausible, human-like responses—including explanations, priorities, and even emotional nuances that match the profile (Motoki et al., 2023). This role-playing by the AI is what makes the generated responses so valuable: they convey authenticity rather than generic platitudes.
- Analyzing Patterns: In the final step, the AI responses are collected and analyzed, much as in a conventional survey analysis: you can identify frequencies and trends, cluster open-ended responses by theme, or compare different virtual subgroups (a toy sketch of this step follows this list). This process, often referred to as AI survey analysis, leverages large language models to simulate and synthesize human-like feedback and offers a novel layer of insight, especially when traditional data is scarce or slow to collect. Interestingly, similar patterns emerge in these data as in real surveys: studies have shown that preferences and opinions simulated by AI correlate strongly with the results of actual surveys (Argyle et al., 2023; Motoki et al., 2023). The results can then be visualized and interpreted just like classic market research insights. Important: you shouldn't blindly trust AI results—especially when it comes to controversial or strategically sensitive questions—but rather treat them as supplementary insights. Still, they provide a quick, indicative impression that often points in the right direction and helps prioritize further testing (Ke et al., 2024).
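As mentioned above, here is a toy sketch of the analysis step in Python, assuming the simulated responses have already been collected; the example answers and theme keywords are invented stand-ins for real twin output.

```python
# Toy sketch: tally recurring themes across simulated open-ended answers.
from collections import Counter

# Invented placeholder answers standing in for collected twin responses.
responses = [
    "Data security is my top concern, then the monthly price.",
    "Pricing matters, but integration with our CRM is decisive.",
    "I would need GDPR-compliant hosting and transparent pricing.",
    "Ease of integration and support quality outweigh price for me.",
]

# Illustrative keyword lists; a production pipeline would use
# embeddings or a proper topic model instead of substring matching.
themes = {
    "security": ["security", "gdpr", "compliant"],
    "price": ["price", "pricing", "cost"],
    "integration": ["integration", "crm"],
}

counts = Counter()
for text in responses:
    lowered = text.lower()
    for theme, keywords in themes.items():
        if any(kw in lowered for kw in keywords):
            counts[theme] += 1

for theme, n in counts.most_common():
    print(f"{theme}: mentioned in {n}/{len(responses)} responses")
```

The resulting tallies can then feed the same charts and cross-group comparisons familiar from classic survey reporting.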
The neuroflash Method: Human-Grounded AI Sampling
While many providers operate digital twins based on general demographic prompts or media consumption data, neuroflash goes a step further: our virtual target groups are not only statistically modeled, but also deeply grounded in psychological science and anchored in profiles from real human research.
Since 2017, we have been researching methods to simulate human behavior in digital systems. Starting with lean Word2Vec models, through the first transformer models (BERT), to today’s generative LLMs, we have not only accompanied this development but shaped it with our own methodological contributions—especially with a focus on behavioral and psychological realism.
Our foundation: an in-house curated dataset with thousands of real panel profiles, where each individual was asked over 500 questions—about their values, emotions, attitudes toward authority, everyday norms, and patterns of social orientation. The questions are deeply rooted in evolutionary psychology, sociology, and modern personality research. This allows us to identify differences in, for example, attitudes toward authority, normative ideals, or political-moral foundations—key factors for realistic consumer profiles.
These real profiles serve as a blueprint for the creation of our digital twins:
- Each simulated persona is based on a real, internally consistent, and humanly complex case.
- When necessary, information is meaningfully supplemented to reflect the client’s requirements as precisely as possible.
- With this approach, we achieve higher “human fidelity”: our digital twins not only provide statistically plausible average answers—but also display strong opinions, divergent perspectives, and psychologically explainable attitudes.
In addition, we have years of experience dealing with bias, social desirability, and cognitive distortions—both on a technical and psychological level. Our expertise has already been utilized by leading companies such as Beiersdorf, Volkswagen, Adidas, various insurance providers, banks, and international agency networks.
Conclusion: What makes neuroflash unique is not just the technological access to LLMs, but the combination of data psychology, AI architecture, and empirical human research. Our clients thus receive digital twins that are not generic personas but true digital counterparts, capable of responding authentically and with nuance.
Areas of Application: What Can Be Researched with Virtual Target Groups?
There are hardly any limits to the method's applications: wherever companies want to understand the opinions or behaviors of their (potential) customers, digital twins can be used (Brand et al., 2023). However, as with any method, it is important to be aware of certain limitations when interpreting the results.
✅ Use Cases
- Brand Messaging & Communication: Marketing teams can test which messages or advertising slogans resonate with different target groups. The AI personas react to slogans or campaign ideas and indicate what they think about them. This makes it easy to see, for example, whether a particular tone of voice works better with tech experts than with business generalists, or which product benefits should be highlighted.
- Product and Concept Testing: Before launching a new product or feature, you can survey the digital target group: How is the product idea received? Which features are most important? Would they be willing to pay for it? This allows for early feedback without expensive test markets or focus groups. Even specific B2B contexts—like feedback from purchasing decision-makers on a SaaS offering—can be simulated in this way, which would otherwise be difficult due to small sample sizes.
- Opinion and Trend Forecasts: How would my customer group react to market changes? What happens if a new competitor appears or a particular event occurs? AI can be used to play out hypothetical scenarios, and social trends or policy changes can be anticipated in terms of public sentiment (Chu et al., 2023). For example: "How would our customers likely react to a 10% price increase?"—the simulated answers at least indicate whether acceptance or dissatisfaction would prevail (a sketch of such a scenario test follows below).
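To show what such a scenario test could look like in practice, here is a hedged sketch: a small batch of twins reacts to a hypothetical 10% price increase, and the sentiment is tallied. The model name, personas, and the ACCEPT/OBJECT convention are illustrative assumptions, not a prescribed neuroflash workflow.

```python
# Hedged sketch: play out a price-increase scenario across several twins.
from collections import Counter

from openai import OpenAI

client = OpenAI()

# Hypothetical personas; real studies would use far richer profiles.
personas = [
    "42-year-old IT manager at a mid-sized industrial firm, budget-focused",
    "29-year-old startup founder who values speed over cost",
    "55-year-old head of procurement who negotiates every contract hard",
]

scenario = (
    "The SaaS subscription you use will become 10% more expensive next "
    "quarter. Reply with exactly one word, ACCEPT or OBJECT, followed by "
    "one sentence of reasoning."
)

votes = Counter()
for persona in personas:
    reply = client.chat.completions.create(
        model="gpt-4o",  # illustrative; any capable chat model works
        messages=[
            {"role": "system", "content": f"Answer as this person: {persona}"},
            {"role": "user", "content": scenario},
        ],
    ).choices[0].message.content
    votes["ACCEPT" if reply.strip().upper().startswith("ACCEPT") else "OBJECT"] += 1

print(votes)  # e.g. Counter({'OBJECT': 2, 'ACCEPT': 1})
```

Scaling the persona list from three to several thousand entries is what turns this from a demo into the kind of simulated survey described above.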
⚠️ Limitations to Consider
- Behavioral Change is Needed: Using AI tools effectively requires a mindset shift within the team. It takes time to get familiar with prompt crafting, to understand how models respond, and to iterate toward useful results. Without training or buy-in, any new approach risks being underutilized.
- Generic Responses without Prompt Tuning: Language models (ChatGPT in particular) tend to give overly generic answers unless well instructed. Prompts must be fine-tuned, or pre-built frameworks used, to receive differentiated, persona-aligned feedback (see the prompt contrast sketched after this list).
- Limited Contextual Awareness of Current Trends: Unless enriched with timely background information, LLMs might miss out on very recent market movements. Either external updates (e.g., via RAG or manual input) or fine-tuned, newer models are required to reflect evolving contexts accurately.
- Pricing Predictions Are Not Reliable Yet: Models are still weak in predicting realistic price thresholds or trade-off preferences, especially in B2B environments. Elasticity and willingness-to-pay often require empirical calibration that goes beyond linguistic simulation.
- Image Evaluation Works, but Needs Validation: While AI can comment on visuals or ad creatives, the evaluation of images by virtual personas is still experimental and lacks scientific validation. Interpret these insights with caution until broader benchmarks are available.
- Challenges in Representing Diverse Human Perspectives: While LLMs can provide insightful and articulate responses across a wide range of topics, they may not always reflect the full diversity of human opinions—particularly those of specific demographic groups. Even when prompted to adopt certain viewpoints, these models can struggle to reproduce the nuance found in real-world public opinion (Santurkar et al., 2023). This highlights an ongoing challenge in treating LLMs as "silicon subjects": their outputs are shaped by patterns in training data, which may not fully capture the breadth of societal perspectives. Care should be taken when interpreting their responses in contexts where demographic sensitivity or representational balance is important.
- Identifying Effective Conditioning Variables for Modeling Human Perspectives: While persona prompting can help large language models simulate a broader range of perspectives, its effectiveness is currently constrained by the limited explanatory power of the demographic, social, and behavioral variables commonly used. In many subjective NLP settings, these variables account for only a small fraction of the variation in human responses, which caps the potential gains from conditioning on them (Hu & Collier, 2024). This highlights an essential open question: identifying which types of variables—beyond standard demographics—are most informative for guiding LLM behavior in subjective or value-laden contexts. Without more meaningful conditioning signals, attempts to personalize or diversify model outputs may remain shallow or inconsistent.
- Lack of Methodological Rigor in Persona Generation Introduces Systemic Bias: The growing use of LLM-generated personas in fields such as public opinion modeling and social simulation offers compelling promise, but many techniques are methodologically underdeveloped. Most persona generation approaches rely on heuristics rather than principled, empirically validated frameworks, resulting in simulations that may reflect systemic biases rather than mitigate them. These limitations can produce substantial deviations from real-world population-level outcomes, as seen in applications like election forecasting and survey modeling (Li et al., 2025).
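To illustrate the prompt-tuning point from the list above, the contrast below places a generic question next to a persona-aligned one. The wording is a made-up example, not a neuroflash template.

```python
# A generic question invites interchangeable platitudes...
GENERIC_PROMPT = "What do you think of our new project management tool?"

# ...while a persona-aligned prompt pins down role, context, and the
# concrete trade-offs the answer must address (hypothetical wording).
TUNED_PROMPT = (
    "You are a 50-year-old head of procurement at a machine builder with "
    "800 employees. You have run SAP for 15 years and distrust vendors "
    "who oversell. In the first person, assess our new project management "
    "tool: name the one feature that would win you over and the one risk "
    "that would stop the purchase."
)

# Sending both to the same model (as in the earlier sketches) makes the
# difference in answer specificity immediately visible.
print(GENERIC_PROMPT)
print(TUNED_PROMPT)
```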
Advantages for Small and Medium-Sized Businesses
SMBs, in particular, benefit enormously from this AI method. Many insights that were once reserved for large corporations with big market research budgets are now easily accessible. The key advantages at a glance:
- Market Research Despite Tight Budgets: Digital twins make it possible to ask targeted questions and obtain valid data without major investments. Even with a small marketing budget, you can "simulate" regular surveys and work in a data-driven way. The principle of "knowing more for less money" becomes reality—a real competitive advantage for SMBs (Ke et al., 2024).
- Lightning-Fast Customer Insights: Instead of waiting weeks for results, insights are available immediately. This allows for an agile approach—campaigns can be optimized in real time, content ideas validated instantly, and decisions accelerated (Argyle et al., 2023). In today's fast-paced business world, this time gain is invaluable.
- New Data Sources as a Supplement: Digital twins don't have to be a replacement but can be used in addition to traditional methods. AI-generated data brings additional depth and angles to the analysis that might not emerge from small real samples alone. For example, an AI evaluation can provide clues about implicit needs or wording that wouldn't be captured in a quantitative survey. This gives companies a richer overall picture and allows them to generate hypotheses that can later be tested in reality. Research sees great potential here in using LLMs as "subpopulation representative models" to more reliably capture the opinions of different subsegments (Wang, Zhang, & Zhang, 2023).
Not to be forgotten: this AI method scales on demand. Whether you "survey" 50 or 5,000 simulated participants makes hardly any difference in time or cost, a scaling advantage that especially benefits small businesses.
Conclusion – AI-Powered Market Research as a Game Changer
The use of digital twins opens up entirely new possibilities for gaining customer insights (Aher et al., 2023). Companies—especially SMBs—can now obtain insights at the push of a button that would previously have required significant effort. Of course, this method does not replace traditional market research in every case. But it is an efficient tool for quickly testing hypotheses, identifying early trends, and making more informed decisions before committing large budgets (Brand et al., 2023).
The current scientific results are encouraging: in many scenarios, the AI's answers closely match real survey data (Motoki et al., 2023), follow basic economic principles such as plausible price-value perception (even though, as noted above, precise willingness-to-pay estimates remain unreliable), and provide qualitative insights that closely resemble human thought processes (Argyle et al., 2023). At the same time, one should be aware of the current limitations—for example, that AIs may carry certain biases from their training data or reach their limits with very specific niche questions. However, the technology is evolving rapidly, and with every LLM upgrade, the fidelity of the simulation increases (Ke et al., 2024).
For marketing decision-makers, this means: get started now, gain experience, and use digital twins as a supplementary source of insight. This way, trends can be identified faster, actions can be planned more precisely, and overall, smarter marketing decisions can be made—scientifically validated and at the same time pragmatic for everyday use. neuroflash’s “digital twin” method thus makes cutting-edge AI research directly applicable in practice and helps smaller companies compete on equal footing with larger competitors by leveraging data-driven approaches.
Summary for Sharing
Digital Twins by neuroflash for Data-Driven Decisions
neuroflash’s digital twins approach enhances data-driven decision-making by leveraging cutting-edge AI models to create virtual target groups. These simulated consumers respond with near-human accuracy—fast, flexible, and cost-efficient.
Key benefits at a glance:
- High Precision: Studies report roughly 80% or higher agreement between digital twins' responses and real human surveys, providing deep and reliable insights.
- Speed & Flexibility: Real-time results enable swift testing of new ideas, hypotheses, and marketing messages.
- Significant Cost Efficiency: Drastically reduces costs compared to traditional methods (e.g., unlimited neuroflash studies at €999/month vs. traditional surveys costing over €10,000 per study).
- Robust Psychological Validity: neuroflash grounds digital twins in deeply psychological and empirically validated profiles, ensuring authentic, nuanced responses rather than generic AI outputs.
- Optimized AI Technology: Combines three distinct AI models (Word2Vec, BERT, GPT) to leverage their individual strengths—up-to-date data, semantic precision, and generative flexibility.
Versatile applications:
- Immediate testing of marketing messages
- Agile product and concept testing
- Rapid opinion and trend predictions
Especially beneficial for small and medium-sized enterprises, neuroflash provides unprecedented opportunities for affordable, quick, and practical market research.
Conclusion: neuroflash’s digital twins rapidly deliver robust and cost-effective insights, complementing or partially replacing traditional market research, accelerating data-based decisions, and providing a competitive advantage for businesses of all sizes.
References
Aher, G., Arriaga, R. I., & Kalai, A. T. (2023). Using large language models to simulate multiple humans and replicate human subject studies. arXiv preprint arXiv:2208.10264. https://arxiv.org/abs/2208.10264
Argyle, L. P., Busby, E. C., Fulda, N., Gubler, J. R., Rytting, C., & Wingate, D. (2023). Out of one, many: Using language models to simulate human samples. arXiv preprint arXiv:2302.07257. https://arxiv.org/abs/2302.07257
Bhatia, S. (2023). Inductive reasoning in minds and machines. Psychological Review, 130(4), 734–752. https://doi.org/10.1037/rev0000446
Brand, J., Israeli, A., & Ngwe, D. (2023). Using GPT for market research. SSRN. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4395751
Brown, T. B., Mann, B., Ryder, N., et al. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877–1901. https://doi.org/10.48550/arXiv.2005.14165
Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183–186. https://doi.org/10.1126/science.aal4230
Chu, S., Goyal, P., Gottipati, S., Wang, X., Li, J., Levy, R., & Zhang, Y. (2023). Language models trained on media diets can predict public opinion. arXiv preprint arXiv:2303.16779. https://arxiv.org/abs/2303.16779
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of NAACL-HLT 2019, 4171–4186. https://doi.org/10.18653/v1/N19-1423
Hu, T., & Collier, N. (2024). Quantifying the persona effect in LLM simulations. arXiv preprint arXiv:2402.10811. https://arxiv.org/pdf/2402.10811
Ke, L., Zhao, Z., & McAuley, J. (2024). Speed vs. fidelity: A benchmarking study of LLMs for simulated respondent sampling. arXiv preprint arXiv:2402.18144. https://arxiv.org/abs/2402.18144
Li, A., et al. (2025). LLM generated persona is a promise with a catch. arXiv preprint arXiv:2503.16527. https://arxiv.org/pdf/2503.16527
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781. https://arxiv.org/abs/1301.3781
Motoki, K., Lin, Y., Albarracín, D., & Wu, C. M. (2023). Evaluating the fidelity of large language models in simulating human survey responses. arXiv preprint arXiv:2309.06364. https://arxiv.org/abs/2309.06364
Park, J. S., Zou, C. Q., Shaw, A., Cao, J., Berger, B., Hill, B. M., & Macy, M. W. (2024). Generative agent simulations of 1,000 people. arXiv preprint arXiv:2411.10109. https://arxiv.org/abs/2411.10109
Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023). Whose opinions do language models reflect? Proceedings of the 40th International Conference on Machine Learning, PMLR 202. https://proceedings.mlr.press/v202/santurkar23a/santurkar23a.pdf
Wang, M., Zhang, D. J., & Zhang, H. (2023). Large language models as subpopulation representative models: A review. arXiv preprint arXiv:2310.17888. https://arxiv.org/abs/2310.17888