
I used Zhipu AI's online CogView4 model to generate an image. The result mostly matched what I expected, but all the text in the image came out in English instead of the Chinese I specified.

I haven't tested the open-source version, but in theory the online version should be the more capable of the two.

Perhaps the prompt is too complicated for the model to understand or follow, or perhaps English is still prioritized internally.

Zhipu AI trial page: https://bigmodel.cn/trialcenter/modeltrial

The prompt was as follows:

Please draw a picture:
### Overall Layout
- Simple cartoon style
- The image is divided into two parts, the left is "Before OpenAI", and the right is "After OpenAI", connected by an arrow (→).
- Each part contains two scenes (Top: Coding, Bottom: Bug Fixing)

### Left: Before OpenAI
1. **Top Part: Developer Coding**
   - Background: A simple desk with an old computer monitor on it.
   - Character: A cartoon developer (round head), sitting in front of the computer, looking focused and a little confused.
   - Text: Write "Developer Coding - 2 hours" in a bubble above the developer's head or at the top of the screen.

2. **Bottom Part: Developer Debugging**
   - Background: Also a desk and computer, but the developer looks tired, frustrated, holding his head in his hands, staring at the computer screen.
   - Character: The same cartoon developer, looking pained.
   - Text: Write "Developer Bug Fixing - 6 hours" in a bubble above the developer's head or at the top of the screen.

### Right: After OpenAI
1. **Top Part: ChatGPT generates code**
   - Background: Also a desk and computer, but there may be an icon prompting ChatGPT next to the computer screen.
   - Character: The developer is sitting in front of the computer, looking relaxed or surprised, indicating that the code has been generated by ChatGPT.
   - Text: Write "ChatGPT generates code - 5 minutes" in a bubble above the developer's head or at the top of the screen.

2. **Bottom Part: Developer Debugging**
   - Background: Desk and computer, the developer looks more tired and desperate, holding his head in his hands.
   - Character: The same cartoon developer, looking even more pained.
   - Text: Write "Developer Bug Fixing - 24 hours" in a bubble above the developer's head or at the top of the screen.

The actual generated image

A test with a simple prompt gives good results.

It seems that rendering text into the image, at least, is not yet reliable for complex multi-scene prompts. For simple single-scene prompts the model is a good fit, especially for beaches and advertisements, where the results are very good.
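One way to check whether prompt complexity is the cause would be to split the four-panel prompt into four single-scene prompts, each with exactly one text label, and generate them separately. The sketch below only builds the prompt strings; it does not call CogView4. The names `Scene` and `build_scene_prompt` are hypothetical, and the closing "do not translate" instruction is an assumption of mine that I have not verified against the model. The labels are the ones from the prompt above and would be replaced by whatever text you want drawn.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scene:
    """One panel of the original four-panel prompt (hypothetical helper type)."""
    side: str    # "Before OpenAI" or "After OpenAI"
    action: str  # what the panel shows
    mood: str    # how the developer looks
    label: str   # text to draw in the bubble


def build_scene_prompt(scene: Scene) -> str:
    """Return a short single-scene prompt that carries exactly one text label."""
    return (
        "Please draw a picture in a simple cartoon style. "
        f"Scene ({scene.side}): {scene.action}. "
        f"The developer looks {scene.mood}. "
        f'Write "{scene.label}" in a bubble above the developer\'s head. '
        # Assumption: an explicit instruction like this may help keep the
        # label in its original language; this is untested.
        "Render the quoted text exactly as written; do not translate it."
    )


# The four panels from the original prompt, one Scene each.
SCENES = [
    Scene("Before OpenAI", "a developer coding at a desk with an old monitor",
          "focused and a little confused", "Developer Coding - 2 hours"),
    Scene("Before OpenAI", "a developer debugging at the same desk",
          "tired and frustrated", "Developer Bug Fixing - 6 hours"),
    Scene("After OpenAI", "a developer watching ChatGPT generate code",
          "relaxed or surprised", "ChatGPT generates code - 5 minutes"),
    Scene("After OpenAI", "a developer debugging the generated code",
          "even more tired and desperate", "Developer Bug Fixing - 24 hours"),
]

prompts = [build_scene_prompt(s) for s in SCENES]

for p in prompts:
    print(p)
    print()
```

Each printed prompt can be pasted into the trial page on its own. If the labels come out correctly in the single-scene images but not in the combined one, that would point to prompt complexity rather than a general preference for English.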