Okay, question: how do I get Stable Diffusion to draw characters only, without any background? I'd like to make a composite image but not sure of what prompts to use to get them on a clear background to do it.
Does anyone have any tips or recommendations for generating some nice anime style backgrounds? I'm looking to make a Screensaver that's a slide show of illustrated nature scenery. I found a ghibli style Lora and I've been messing around with that but it keeps putting people in my pictures which is not what I want.
Hmm, that's very strange. These are some random ones that I generated without really doing anything special (granted, I used ComfyUI, but that doesn't make a difference). The best thing I can suggest is adding terms to your negative prompt, stuff like "(People:1.5), (1girl:1.5), (1boy:1.5)", etc. It's also possible that the Ghibli-style Lora was trained predominantly on images with people in them. But you can probably still brute-force it.
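If you end up with a long list of things to suppress, a tiny helper can build the weighted negative prompt for you. This is only a sketch: `weighted_negative` is a made-up name, and it just formats strings in the "(term:weight)" emphasis syntax quoted above.

```python
def weighted_negative(terms, weight=1.5):
    """Format each term as "(term:weight)" and join them with commas."""
    return ", ".join(f"({term}:{weight})" for term in terms)


# The three terms from the post; extend the list with whatever keeps sneaking in.
negative_prompt = weighted_negative(["People", "1girl", "1boy"])
print(negative_prompt)  # (People:1.5), (1girl:1.5), (1boy:1.5)
```

Paste the printed string into the negative prompt field, and raise or lower the weight if people still show up.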
Edit: I'm not sure if this is the one that you're currently using, but it seems a lot more specific. Especially combined with their all-purpose checkpoint/SD model:
If for some reason you can't find any LoRAs of your favorite brand of virtual e-girl/e-boy, I'd recommend trying to make your own. The process generally doesn't require anything excessive hardware-wise (and even if you don't have enough, you can set up a Colab for it with a bit more work), and setting it up is surprisingly easy; if I could figure it out, so can you. Tested with Mashiro since he's one of the few Vtubers I couldn't find a LoRA for; most of the generations are kinda bad, but that's likely a mix of not having a good enough imageset (could fix that by just using Pixiv, that's all but guaranteed to have enough material), and me not knowing what the fuck I'm doing. Do note that LoRAs only really apply to local models; if you're using NovelAI, tough luck, since as far as I know they have no plans of supporting LoRA.
So I'm just now looking into this thing called ControlNet which I've seen mentioned a bit before.
You feed it one image of something to use as a reference, be it a character's likeness or a particular pose, then enter your desired prompt and it will build from the reference picture. Similar to img2img I guess but much more powerful. I have so far had just the quickest look at it and was impressed enough to show it here.
It also generated this image, so you can see how it is interpreting the pose. I do not yet know if you are able to manipulate this at all to get more accurate results.
I think I might have mentioned my fondness for Wildcards and the Dynamic Prompts extension for Automatic1111. For the past few days I've been working on creating the ultimate Hololive Member Wildcard/Lora combo. Each member has an accompanying Lora (some have multiple, in case you want to change the info in the .txt file).
Sadly I can't credit the creators of each Lora, since there are so many and I have a habit of renaming them to make things easy for myself. But if you use the Civitai Helper extension (https://github.com/butaixianran/Stable-Diffusion-Webui-Civitai-Helper), it does catch about 90% of them.
Fairly easy to use: install the Dynamic Prompts extension (https://github.com/adieyal/sd-dynamic-prompts.git) and the Additional Network extension (https://github.com/kohya-ss/sd-webui-additional-networks). I'm honestly not sure if you even need the latter extension anymore, but just to be safe. Then place the Hololive folder in your Lora folder, and HololiveMember.txt in your \webui\extensions\sd-dynamic-prompts\wildcards folder.
Then all you have to do is add __HololiveMember__ to your prompt and it should randomly pick a Holo JP/EN/ID member to go along with the rest of your prompt. There is one bug: you can't/shouldn't increase your batch size above 1. If you want to generate multiple images and have each image be of a random member, increase the batch count instead.
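For anyone wondering what the wildcard is doing, here is a rough sketch of the idea in plain Python. It is not the extension's actual code: `expand_wildcards`, the two sample entries and the Lora file names in them are all made up for illustration, and a real wildcard file would have one line per member.

```python
import random
import re

# A wildcard token looks like __Name__ inside the prompt.
WILDCARD = re.compile(r"__(\w+?)__")


def expand_wildcards(prompt, wildcards, rng=random):
    """Replace each __name__ token with a random entry from wildcards[name]."""
    def pick(match):
        options = wildcards.get(match.group(1))
        if not options:
            return match.group(0)  # unknown wildcard: leave the token as-is
        return rng.choice(options)

    return WILDCARD.sub(pick, prompt)


# Hypothetical stand-in for the lines of HololiveMember.txt.
wildcards = {
    "HololiveMember": [
        "shirakami fubuki, white hair, fox ears, fox tail <lora:fubuki:0.8>",
        "shiranui flare, blonde hair, pointy ears, dark skin <lora:flare:0.8>",
    ]
}

# Like batch count = 3: the prompt is expanded again for every image.
for _ in range(3):
    print(expand_wildcards("__HololiveMember__, karate pose in dojo", wildcards))
```

The loop is the batch count idea from the post: each pass gets a fresh pick, so each image can land on a different member.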
Maybe not really my thing, since I tend to go in already knowing who I want the subject to be, but I can see the extension being useful for giving random locations, poses and such. it's nice to have all the Holo LORAs and descriptions in one place though.
One thing you might want to do is make multiple lines for those LORAs who have different outfits baked in which require a trigger word. (Also noticed you used fox girl / fox ears / fox tail to describe Mio instead of wolf)
Saying that, here's a picture using your __HololiveMember__ prompt, combined with a ControlNet pose:
(masterpeice:1.2, best quality:1.2, highly detailed, cinematic lighting, sharp focus, perfect face, absurdres),
__HololiveMember__ karate pose in dojo, gym, karate gi, white pants, barefoot
Negative: (worst quality, low quality:1.4), easynegative, gloves
If you haven't already, take a look at this for some inspiration of what's currently possible: https://github.com/lllyasviel/ControlNet
I spent a while trying to think up something cool that's Vtuber related I could demonstrate, but I lack the imagination unfortunately.
It kind of hints at it, but what I'd really like to see in the future is the ability to directly modify the skeleton in real time in the UI to get that perfect pose. For now though, you'll just need to draw the bones yourself.
I might have a go at using this to draw Pippa with an assault rifle. One of the few things that AI art hates more than hands and fingers are guns (even after trying multiple Loras), so I'll see if this helps.
Ok, this is just getting silly. Remember how just before I said it would be cool to be able to manipulate the skeleton yourself in-editor? Yeah, there's already an extension for that.
But that wasn't enough - there's even a god damn 3D version as well, so you can pose your character as if you were in a 3D modelling program. Right there in the webUI.
I then found out you can have multiple ControlNets doing different things at the same time. For example, one ControlNet figures out the pose from the skeleton you give it, and another uses a depth map to look at the composition of an image and will insert your perfectly posed character into the scene.
I figured it's been long enough since my last Chatbot-related post to give an update.
The Good
Plenty of options, between Claude, Todd, GPT-4 proxies, etc. New Character Cards are being made daily. Continued development of TavernAI/SillyTavern.
The Bad
Still stuck using Character Cards, and as of yet there is no central hub for said cards, even though multiple sites are vying for that position. To my knowledge, no option is as good as the Illusion game example, https://illusioncards.booru.org/. And even though we currently have a lot of options for models, they are in constant flux.
The Ugly
It's obvious that no current AI Developer wants their AI used in this manner, and I'm sure for the foreseeable future, later updates will continue to crack down/make it more difficult. So it will most likely continue to be a two-sided battle.
As of right now, there is no real reason to use TavernAI, since the current meta of using proxies is a lot easier with SillyTavern. Both of these are front-end GUIs and are useless unless you actually connect them to an AI model. As far as back ends go, there are currently three main options.
Slack Claude Proxy: https://github.com/AmmoniaM/Spermack
My personal favorite and what I've been using for the past few days. Comparable with GPT-4 levels of quality but requires some wrangling with the Jailbreak prompt to not constantly get filtered. But once you have a good setup, it's amazing.
Todd Proxy: https://toddbot.net/v1
Allegedly based on Claude, easy to use, and since it's based on Claude the quality is fairly decent as well. Do note this is a Meme model based on Todd Howard and it will occasionally add references to Bethesda Games in its responses regardless of which cards you use. But you can easily just regenerate a response.
GPT-4/GPT-3.5 Turbo Proxies: https://alwaysfindtheway.github.io / https://boards.4channel.org/g/ Chat Bot thread
Basically what it says on the tin. GPT-3.5 Turbo is fairly readily available but is easily the worst model listed here. But that doesn't mean it's necessarily horrible. GPT-4 on the other hand is still considered the crème de la crème, but is harder to find proxies for and has the tendency to run out of credits fairly quickly.
If you need a guide on how to get up and running once you have a proxy key that you'd like to use:
This is a guide for retards like you, who want to roleplay debauchery things with an LLM! But lack the braincells to figure it out how on your own. This guide will be handholding you like the delicate little infant you are, and make sure that you'll be able to romance your underage little girlfri...
rentry.org
All of this is useless if you don't have any character cards to actually use. Which you can get here:
Find, share, modify, convert, and version control characters and other data for conversational large language models (LLMs). Previously/AKA CharHub, CharaHub, Char Hub.
Unofficial character card depository for use with compatible LLM frontends. Use the "Download original" button to ensure metadata is preserved. Sign in to upload and edit tags. Don't know how to register/upload? See RENTRY.ORG/PYGMABOORU
booru.plus
4chan /g/ & /vt/ Chatbot threads
Web-Based
Character.ai:
As much as I hate to admit it, still the best web-based AI chatbot client. Incredibly easy to use, huge assortment of bots, and they are improving bot memory/context size. Free/Filtered
Agnai:
Probably the second-best option. Unlike Character.ai, this does not have a bot browser and you will have to resort to finding Character Cards like you would with TavernAI/SillyTavern. Free/Unfiltered
Charstar:
Haven't personally used it, but it seems similar to Character.ai's interface. Most likely running some sort of OpenAI model. 100 free messages a day, or $25 a month for unlimited. Premium/Unfiltered (Don't be stupid, there are better free alternatives)
Venusai:
Very new. Never used it, my guess is it's running GPT-3.5 Turbo. I think the person created it as a Character Card-based bot browser in the same vein as Character.ai. Free/Unfiltered
I have not yet dabbled in text AI, other than spending a few minutes on character.ai. Skimming your links there, it mentions finding proxies and API keys to connect to the AI servers.
Is this something where you need to be constantly looking for servers to connect to and relying on people sharing/giving them out? It's not something you just download once and run locally?
The Todd Proxy is fairly consistent, he's pretty active on 4chan. I haven't really had any issues using Claude either, aside from one throwaway account being banned. But the nice thing about SillyTavern/TavernAI is that your chats are recorded locally, so if you switch halfway through, your chats are still there and included in the prompts. Besides those two, proxies are fairly hit or miss. https://whocars123-oai-proxy.hf.space/proxy/openai/v1 is also consistent, although it requires a password now (which you put in the API key field).
Sadly, running locally is currently unfeasible unless you have a top-of-the-line RTX 4090 or a workstation GPU with 24GB of VRAM. And even then it's not going to be Claude/GPT-4 quality.
Edit: Removed the password since it's been changed. But you can find it fairly easily.
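To put a rough number on the 24GB VRAM point, here is a back-of-the-envelope estimate of how much memory the weights alone take. The model sizes and precisions below are example values I picked, not anything from the posts above, and the estimate ignores context/cache and other overhead, so real usage is higher.

```python
def weights_gib(params_billion, bytes_per_param):
    """Memory needed just to hold the model weights, in GiB."""
    return params_billion * 1e9 * bytes_per_param / 2**30


# Example sizes only: 7B, 13B and 30B parameters at 16-bit and 4-bit precision.
for size in (7, 13, 30):
    for label, nbytes in (("fp16", 2), ("4-bit", 0.5)):
        print(f"{size}B {label}: {weights_gib(size, nbytes):.1f} GiB")
```

Under these assumptions a 13B model at 16-bit already needs about as much as a 24GB card has before anything else is loaded, which is roughly why a card of that size is treated as the entry point here.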
2girls 1prompt Experiments with Latent Couple, Tiled Diffusion and more.
As you probably know, it's pretty much impossible to get 2 different characters posed in a scene and accurately describe them using a prompt alone. You'll usually end up with a mutant fused together from the two characters.
Enter Latent Couple.
Using ControlNet to take a pose from a reference image, let's get a nice scene with two characters:
Now using Latent Couple, we paint a rough composition of the image, using a different colour for each thing that you want to separate, and giving each a unique prompt:
This will give us a final prompt of:
(masterpeice:1.2, best quality:1.2, highly detailed, cinematic lighting, sharp focus, perfect face, absurdres), two girls running, beach, sunset, barefoot, backlighting
AND 1girl, hololive, ((shirakami fubuki)), white hair, ponytail, fox ears, long hair, aqua eyes, fox tail, (blue bikini), open mouth, happy
AND 1girl, hololive, ((shiranui flare)), pointy ears, blonde hair, long hair, (dark skin), red eyes, (white one-piece swimsuit), ribbon, straw hat, open mouth, happy
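The prompt above is just a shared scene prompt followed by one sub-prompt per painted region, each on its own line starting with AND. If you'd rather assemble it from pieces, something like this works. It's a sketch with a made-up helper name (`compose_regional_prompt`) and shortened prompts, not anything the extension provides.

```python
def compose_regional_prompt(base, regions):
    """Join a shared scene prompt and per-region prompts, one per line, with AND."""
    parts = [base.strip()] + [region.strip() for region in regions]
    return "\nAND ".join(parts)


# Shortened versions of the prompts above, one entry per painted colour.
prompt = compose_regional_prompt(
    "two girls running, beach, sunset, barefoot, backlighting",
    [
        "1girl, hololive, ((shirakami fubuki)), white hair, fox ears, (blue bikini)",
        "1girl, hololive, ((shiranui flare)), blonde hair, (dark skin), straw hat",
    ],
)
print(prompt)
```

Keeping each character in its own entry makes it easy to swap one of them out without touching the scene prompt.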
Off to a pretty good start:
Now you'll notice some of them don't look quite right. In fact the last one doesn't look like the character at all. What we really want is to use LORAs.
What we would do here is add the LORAs to each character's prompt, and enable another plugin, Composable LORA. Without this, LORAs would be applied to the entire image, rather than just the selected areas that we want.
Unfortunately, the plugin has apparently been broken recently, so I wasn't able to test it.
The good news is there is yet another plugin we can use as an alternative to Latent Couple / Composable LORAs, called Tiled Diffusion.
This plugin is primarily for making extra large images, but also has a Region Prompt Control feature.
Like before, we specify different regions of the image, but this time you're limited to simple rectangular selections. The good news is that it supports LORAs, and you can also add negative prompts for each section:
Since we can now use LORA, we don't have to be so detailed with our prompt either. The downside to this method is that, because it draws separate images for each section and tries to blend them together, you can end up with very obvious seams (I also swapped the position of the characters and adjusted the pose at this point):
It took many many rolls and adjusting the tile settings to get a picture that was acceptable.
Final high-res pictures. First was made with Latent Couple (no LORA) plus Hires Fix in txt2img, and the second in Tiled Diffusion in txt2img then again in img2img with upscale. Funnily, the one without LORA turned out more accurate in this case. Much quicker to get the initial low res image too. I think I'm leaning more towards this method.
Now there is yet another alternative plugin called Regional Prompter. It says LORAs can be separated "to some extent", so I guess I'll try that next and see how it is.
For now though, I should get to bed..
So I've been screwing around with SoftVC VITS Singing Voice Conversion, the thing that people use for having Pekora or whoever sing random songs. It's not amazing or anything but it's pretty fun to screw around with.
voicepaw/so-vits-svc-fork: so-vits-svc fork with realtime support, improved interface and more features.
github.com
This huggingface has a few Hololive models you can use.
Double posting again, since it's not related to my prior post.
I found out Agnai has 11labs support, and it was only $1 for a month of their cheapest subscription, so I figured I'd try it out. The first member I tried (Mumei) didn't sound anything like her; luckily, Kronii worked a lot better. But then I also had to create an entirely new card/scenario.
After all that I was able to get it to this point: