Obsidian Metadata
-
Prompt design fundamentals
- Be specific in your instructions: Craft clear and concise instructions that leave minimal room for misinterpretation.
- Add a few examples to your prompt: Use realistic few-shot examples to illustrate what you want to achieve.
- Break it down step-by-step: Divide complex tasks into manageable sub-goals, guiding the model through the process.
- Specify the output format: In your prompt, ask for the output in the format you want, such as Markdown, JSON, or HTML.
- Put your image first for single-image prompts: While Gemini can handle image and text inputs in any order, for prompts containing a single image (or video), it might perform better if that image or video is placed before the text prompt. However, for prompts where images must be closely interleaved with text to make sense, use whatever order is most natural.
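
The fundamentals above can be combined into a single prompt layout. The sketch below is only an illustration: `Part` and `build_single_image_prompt` are hypothetical names, not part of any Gemini SDK, and the resulting list would still need to be converted into whatever request format your client library expects.

```python
from dataclasses import dataclass


@dataclass
class Part:
    """One piece of a multimodal prompt (hypothetical structure, not an SDK type)."""
    kind: str     # "image" or "text"
    content: str  # file path for an image, raw text otherwise


def build_single_image_prompt(image_path, instruction, steps=(), examples=(), output_format=None):
    """Assemble a prompt with the image first, then specific instructions,
    step-by-step sub-goals, few-shot examples, and the requested output format."""
    # Image goes first for single-image prompts.
    parts = [Part("image", image_path)]

    lines = [instruction.strip()]
    if steps:
        # Break the task into numbered sub-goals.
        lines.append("Work through these steps:")
        lines.extend(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    for example_input, example_output in examples:
        # Few-shot examples showing the desired behavior.
        lines.append(f"Example input: {example_input}")
        lines.append(f"Example output: {example_output}")
    if output_format:
        # State the output format explicitly.
        lines.append(f"Return the answer as {output_format}.")

    parts.append(Part("text", "\n".join(lines)))
    return parts


prompt = build_single_image_prompt(
    "receipt.png",
    "List every purchased item shown on this receipt.",
    steps=["Read each line of the receipt.", "Extract the item name and price."],
    examples=[("Coffee 3.50", '{"item": "Coffee", "price": 3.5}')],
    output_format="JSON",
)
for part in prompt:
    print(part.kind, "->", part.content)
```

Keeping the prompt as an ordered list of parts makes it easy to swap the order when a task needs images interleaved with text.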
-
Troubleshooting your multimodal prompt
- If the model is not drawing information from the relevant part of the image: Give hints about which aspects of the image you want the model to draw information from.
- If the model output is too generic (not tailored enough to the image/video input): At the start of the prompt, try asking the model to describe the image(s) or video before providing the task instruction, or try asking the model to refer to what’s in the image.
- To identify which part of the prompt failed: Ask the model to describe the image, or ask the model to explain its reasoning, to gauge the model’s initial understanding.
- If your prompt results in hallucinated content: Try dialing down the temperature setting or asking the model for shorter descriptions so that it’s less likely to extrapolate additional details.
- Tuning the sampling parameters: Experiment with different temperature and top-k values to adjust how varied (creative) or how predictable the model’s output is.
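
To make the effect of temperature and top-k concrete, here is a minimal sketch of how these two parameters are commonly applied when choosing the next token. It is a generic illustration, not Gemini's actual implementation; `sample_token` and the toy logits are assumptions made for this example.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_k=None, rng=random):
    """Pick one token from a {token: logit} mapping.

    Lower temperature sharpens the distribution (less extrapolation);
    smaller top_k restricts the choice to the k highest-scoring tokens.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Rank tokens by score and keep only the top_k candidates, if requested.
    ranked = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]

    # Temperature-scaled softmax weights (shifted by the max for numerical stability).
    scaled = [score / temperature for _, score in ranked]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    tokens = [token for token, _ in ranked]
    return rng.choices(tokens, weights=weights, k=1)[0]


toy_logits = {"cat": 2.0, "dog": 1.0, "eel": 0.1}  # made-up scores
print(sample_token(toy_logits, temperature=0.2, top_k=2))  # almost always "cat"
print(sample_token(toy_logits, temperature=2.0))           # noticeably more varied
```

With `top_k=1` the choice is deterministic (always the highest-scoring token), which mirrors the advice above: tightening the sampling parameters leaves the model less room to add details that are not in the input.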

