"If I'm gonna let Samsung fuck me, I'm not gonna add Ronald McDonald and make it a menage a trois"Noir Vesper

AI Waifus and You 101

Azehara

Well-known member
!!Foot Dox Confirmed!!
Early Adopter
Joined:  Sep 11, 2022
Looks like she's holding a fusion rifle (with a broken middle finger).
Guns are pretty hard to get. I think you have to create LORA for those also.
 

RestlessRain

Well-known member
Early Adopter
Joined:  Sep 21, 2022
Okay, question: how do I get Stable Diffusion to draw characters only, without any background? I'd like to make a composite image but not sure of what prompts to use to get them on a clear background to do it.
 

Clem the Gem

Unknown member
Early Adopter
Joined:  Sep 10, 2022
Okay, question: how do I get Stable Diffusion to draw characters only, without any background? I'd like to make a composite image but not sure of what prompts to use to get them on a clear background to do it.
Should be able to just use "simple background, green background" to get a solid green you can cut out. Or any other colour of course.
 

Lesbian Solid Snake

Pettan Hag Supremacy
Joined:  Sep 19, 2022
I have the opposite question.

Does anyone have any tips or recommendations for generating some nice anime style backgrounds? I'm looking to make a Screensaver that's a slide show of illustrated nature scenery. I found a ghibli style Lora and I've been messing around with that but it keeps putting people in my pictures which is not what I want.
 

Watamate

Previously known as Tatsunoko
Early Adopter
Joined:  Oct 8, 2022
I have the opposite question.

Does anyone have any tips or recommendations for generating some nice anime style backgrounds? I'm looking to make a Screensaver that's a slide show of illustrated nature scenery. I found a ghibli style Lora and I've been messing around with that but it keeps putting people in my pictures which is not what I want.
Hmm, that's very strange. These are some random ones that I generated without really doing anything special (Granted, I used ComfyUI but doesn't make a difference). The best thing I can suggest is adding prompts to your Negative Prompts. Stuff like "(People:1.5), (1girl:1.5), (1boy:1.5)", etc. It's also just possible that the ghibli style Lora was trained on predominantly images with people in them. But you can probably still brute force it.

ComfyUI_00003_.pngComfyUI_00005_.png

Edit: I'm not sure if this is the one that you're currently using, but it seems a lot more specific. Especially combined with their all-purpose checkpoint/SD model:

 

Een Gevolg

Undead Nurse #1
Early Adopter
Joined:  Sep 10, 2022
If for some reason you can't find any LoRAs of your favorite brand of virtual e-girl/e-boy, I'd recommend trying to make your own. The process generally doesn't require anything excessive hardware-wise (and even if you don't have enough, you can set up a Colab for it with a bit more work), and setting it up is surprisingly easy; if I could figure it out, so can you. Tested with Mashiro since he's one of the few Vtubers I couldn't find a LoRA for; most of the generations are kinda bad, but that's likely a mix of not having a good enough imageset (could fix that by just using Pixiv, that's all but guaranteed to have enough material), and me not knowing what the fuck I'm doing. Do note that LoRAs only really apply to local models; if you're using NovelAI, tough luck, since as far as I know they have no plans of supporting LoRA.
00004-731913265.png
 

Clem the Gem

Unknown member
Early Adopter
Joined:  Sep 10, 2022
So I'm just now looking into this thing called ControlNet which I've seen mentioned a bit before.

You feed it one image of something to use as a reference, be it a character's likeness or a particular pose, then enter your desired prompt and it will build from the reference picture. Similar to img2img I guess but much more powerful. I have so far had just the quickest look at it and was impressed enough to show it here.

Here is my reference image of a pose:


and with a simple prompt, ended up with:


Prompt:
(masterpeice:1.2, best quality:1.2, ultra detailed, cinematic lighting, sharp focus, highly detailed face),
shishiro botan, hololive, running, jungle, blurry background, motion lines, motion blur,
grey hair, lgrey eyes, ong hair, lion ears, lion tail, shorts, tank top,

It also generated this image, so you can see how it is interpreting the pose. I do not yet know if you are able to manipulate this at all to get more accurate results.


Installation and info here: https://aituts.com/controlnet/

[Edit] For reference, here is what I got using the pose picture in a straight up img2img at different strengths:
 
Last edited:

Watamate

Previously known as Tatsunoko
Early Adopter
Joined:  Oct 8, 2022
I think I might have mentioned my fondness for Wildcards and the Dynamic Prompt extension for Automatic 1111. And the past days or so I've been working on trying to create the ultimate Hololive Member Wildcard/Lora combo. Each member has an accommodated Lora (Some have multiple, in case you want to change the info in the .txt file).

Sadly I can't credit the creators of each Lora since there are so many and I have the habit of renaming them just to make things easy for me. But if you use the Civitai Helper extension (https://github.com/butaixianran/Stable-Diffusion-Webui-Civitai-Helper) It does catch about 90% of them.

Fairly easy to use, install the Dynamic Prompts extension (https://github.com/adieyal/sd-dynamic-prompts.git), the Additional Network extension (https://github.com/kohya-ss/sd-webui-additional-networks). I'm honestly not sure if you even need the latter extension anymore, but just to be safe. Place the Hololive folder in your Lora folder, and the HololiveMember.txt in your \webui\extensions\sd-dynamic-prompts\wildcards folder.

Then all you have to do is add __HololiveMember__ to your prompt and it should randomly pick out a Holo JP/EN/ID member to go along with the rest of your prompt. There is one bug where you can't/shouldn't increase your batch size higher than 1, if you want to generate multiple images and have each image be of a random member then you have to increase the batch count.


Here's a 20-batch count generation I made to give you a decent idea of what it's capable of (Fairly large image size).

grid-0071.png
 

The Proctor

Manager Arc Unlocked?
Staff member
Joined:  Sep 9, 2022
So I'm just now looking into this thing called ControlNet which I've seen mentioned a bit before.

You feed it one image of something to use as a reference, be it a character's likeness or a particular pose, then enter your desired prompt and it will build from the reference picture. Similar to img2img I guess but much more powerful. I have so far had just the quickest look at it and was impressed enough to show it here.

It also generated this image, so you can see how it is interpreting the pose. I do not yet know if you are able to manipulate this at all to get more accurate results.
View attachment 22385

Installation and info here: https://aituts.com/controlnet/

Now that's seriously cool. Exactly the kind of progression of the tech I was curious about when it first began cropping up.
 

Clem the Gem

Unknown member
Early Adopter
Joined:  Sep 10, 2022
I think I might have mentioned my fondness for Wildcards and the Dynamic Prompt extension for Automatic 1111. And the past days or so I've been working on trying to create the ultimate Hololive Member Wildcard/Lora combo. Each member has an accommodated Lora (Some have multiple, in case you want to change the info in the .txt file).
.......
Maybe not really my thing, since I tend to go in already knowing who I want the subject to be, but I can see the extension being useful for giving random locations, poses and such. it's nice to have all the Holo LORAs and descriptions in one place though.
One thing you might want to do is make multiple lines for those LORAs who have different outfits baked in which require a trigger word. (Also noticed you used fox girl / fox ears / fox tail to describe Mio instead of wolf)
Saying that, here's a picture using your __HololiveMember__ prompt, combined with a ControlNet pose:



(masterpeice:1.2, best quality:1.2, highly detailed, cinematic lighting, sharp focus, perfect face, absurdres),
__HololiveMember__ karate pose in dojo, gym, karate gi, white pants, barefoot
Negative: (worst quality, low quality:1.4), easynegative, gloves

There is one bug where you can't/shouldn't increase your batch size higher than 1, if you want to generate multiple images and have each image be of a random member then you have to increase the batch count.
Seemed to work fine for me as you can see, unless I'm misunderstanding.


Now that's seriously cool. Exactly the kind of progression of the tech I was curious about when it first began cropping up.

If you haven't already, take a look at this for some inspiration of what's currently possible: https://github.com/lllyasviel/ControlNet
I spent a while trying to think up something cool that's Vtuber related I could demonstrate, but I lack the imagination unfortunately.
It kind of hints at it, but what I'd really like to see in the future is the ability to directly modify the skeleton real-time in the UI to get that perfect pose. For now those, you'll just need to draw them bones yourself.
 
Last edited:

RestlessRain

Well-known member
Early Adopter
Joined:  Sep 21, 2022
I spent a while trying to think up something cool that's Vtuber related I could demonstrate, but I lack the imagination unfortunately.
I might have a go at using this to draw Pippa with an assault rifle. One of the few things that AI art hates more than hands and fingers are guns (even after trying multiple Loras), so I'll see if this helps.
 

Clem the Gem

Unknown member
Early Adopter
Joined:  Sep 10, 2022
Ok, this is just getting silly. Remember how just before I said it would be cool to be able to be able to manipulate the skeleton yourself in-editor? Yeah, there's already an extension for that.

But that wasn't enough - there's even a god damn 3D version as well, so you can pose your character as if you were in a 3D modelling program.Right there in the webUI.

I then find you can have multiple ControlNets doing different things at the same time. For example one ControlNet figures out the pose from the sekeleton you give it, and another uses a depth map to look at the composition of an image and will insert your perfectly posed character into the scene.

Just look at this shit:





I might have a go at using this to draw Pippa with an assault rifle. One of the few things that AI art hates more than hands and fingers are guns (even after trying multiple Loras), so I'll see if this helps.

I have spent some time trying to come up with a solution using the knowledge I just gained, but haven't got there yet.

Trying to get a gun with a prompt only does not go very well, as you already know:



Using an OpenPose ControlNet, we can get her holding...something at least in the correct pose. It's at least starting to look like a real gun:



I got another reference picture of a gun and tried getting it into the image in a few different ways but have so far been unsuccessful.
 
Last edited:

Watamate

Previously known as Tatsunoko
Early Adopter
Joined:  Oct 8, 2022
I figured it's been long enough since my last Chatbot-related post to give an update.

The Good

Plenty of options, between Claude, Todd, GPT-4 proxies, etc. New Character Cards are being made daily. Continued development of TavernAI/SillyTavern.

The Bad

Still stuck using Character Cards and as of yet, no central hub for said cards. Even though there are multiple sites trying to vie for that position. But no option is as good as the Illusion game example https://illusioncards.booru.org/. To my knowledge. And even though we have a lot of options for models currently, it's in constant shift.

The Ugly

It's obvious that no current AI Developer wants their AI used in this manner, and I'm sure for the foreseeable future, later updates will continue to crack down/make it more difficult. So it will most likely continue to be a two-sided battle.


Options​

  • Tavern-Based​

SillyTavern: https://github.com/Cohee1207/SillyTavern
TavernAI: https://github.com/TavernAI/TavernAI

As of right now, there is no real reason to use TavernAI as the current meta using proxies is made a lot easier with SillyTavern. Both of these are Front-End GUIs and are useless without actually connecting them with an AI model. Now as far as back ends, there are currently 3 main meta.

Slack Claude Proxy: https://github.com/AmmoniaM/Spermack
My personal favorite and what I've been using for the past few days. Comparable with GPT-4 levels of quality but requires some wrangling with the Jailbreak prompt to not constantly get filtered. But once you have a good setup, it's amazing.

Todd Proxy: https://toddbot.net/v1
Allegedly based on Claude, easy to use, and since it's based on Claude the quality is fairly decent as well. Do note this is a Meme model based on Todd Howard and it will occasionally add references to Bethesda Games in its responses regardless of which cards you use. But you can easily just regenerate a response.

GPT-4/GPT-3.5 Turbo Proxies: https://alwaysfindtheway.github.io / https://boards.4channel.org/g/ Chat Bot thread
Basically what it says on the tin. GPT-3.5 Turbo is fairly readily available but is easily the worst model listed here. But that doesn't mean it's necessarily horrible. GPT-4 on the other hand is still considered the crème de la crème, but is harder to find proxies for and has the tendency to run out of credits fairly quickly.

If you need a guide on how to get up and running once you have a proxy key that you'd like to use:

All of this is useless if you don't have any character cards to actually use. Which you can get here:
4chan /g/ & /vt/ Chatbot threads
  • Web-Based​

Character.ai:
As much as I hate to admit it, still the best web-based AI chatbot client. Incredibly easy to use, huge assortment of bots, and are improving their bot memory/context size. Free/Filtered

Agnai:
Probably the second-best option. Unlike Character.ai, this does not have a bot browser and you will have to resort to finding Character Cards like you would with TavernAI/SillyTavern. Free/Unfiltered

Charstar:
Haven't personally used it, but seems similar to Character.ai's interface. Most likely running some sort of OpenAI model. 100 free messages a day/25$ a month for unlimited. Premium/Unfiltered (Don't be stupid, there are better free alternatives)

Venusai:
Very new. Never used it, my guess is it's running GPT-3.5 Turbo. I think the person created it as a Character Card-based bot browser in the same vein as Character.ai. Free/Unfiltered
 
Last edited:

Clem the Gem

Unknown member
Early Adopter
Joined:  Sep 10, 2022
I have not yet dabbled in text AI, other than spending a few minutes on character.ai. Skimming your links there, it mentions finding proxies and API keys to connect to the AI servers.
Is this something where you need to be contantly looking for servers to connect to and relying on people sharing/giving them out? It's not something you just download once and run locally?
 

Watamate

Previously known as Tatsunoko
Early Adopter
Joined:  Oct 8, 2022
I have not yet dabbled in text AI, other than spending a few minutes on character.ai. Skimming your links there, it mentions finding proxies and API keys to connect to the AI servers.
Is this something where you need to be contantly looking for servers to connect to and relying on people sharing/giving them out? It's not something you just download once and run locally?
The Todd Proxy is fairly consistent, he's pretty active on 4chan. Claude I haven't really had any issues using either outside of one throwaway account being banned. But the nice thing about SillyTavern/TavernAI is that your chats are recorded locally, so if you switch halfway through your chats are still there and included in the prompts. Besides those two they're fairly hit or miss. https://whocars123-oai-proxy.hf.space/proxy/openai/v1 is also consistent although it requires a password now (which you put in the API key field).

Running locally currently is unfeasible sadly unless you have a top-of-the-line RTX 4090 or a workstation GPU with 24GB of VRAM. And even then it's not going to be Claude/GPT-4 quality.

Edit: Removed the password since it's been changed. But you can find it fairly easily.
 
Last edited:

Clem the Gem

Unknown member
Early Adopter
Joined:  Sep 10, 2022
2girls 1prompt
Experiemtns with Latent Couple, Tiled Diffusion and more.

As you probably know, it's pretty much impossible to get 2 different characters posed in a scene and accurately describe them using a prompt alone. You'll usualy end up with a mutant fused together from the two characters.

Enter Latent Couple.

Using ControlNet to take a pose from a reference image, let's get a nice scene with two characters:



Now using Latent Couple, we paint a rough composition of the image, using a different colour for each thing that you want to separate, and giving each a unique prompt:


This will give us a final prompt of:

(masterpeice:1.2, best quality:1.2, highly detailed, cinematic lighting, sharp focus, perfect face, absurdres), two girls running, beach, sunset, barefoot, backlighting
AND 1girl, hololive, ((shirakami fubuki)), white hair, ponytail, fox ears, long hair, aqua eyes, fox tail, (blue bikini), open mouth, happy
AND 1girl, hololive, ((shiranui flare)), pointy ears, blonde hair, long hair, (dark skin), red eyes, (white one-piece swimsuit), ribbon, straw hat, open mouth, happy

Off to a pretty good start:


Now you'll notice some of them don't like quite right. In fact the last one doesn't look like the character at all. What we really want is to use LORAs.
What we would do here is add the LORAs to each character's prompt, and enable another plugin, Composable LORA. Without this, LORAs would be applied to the entire image, rather than just the selected areas that we want.
Unfortunately, the plugin has aparently been broken recently, so I wasn't able to test it.

The good news is there is yet another plugin we can use as an alternative to Latent Couple / Composable LORAs, called Tiled Diffusion.
This plugin is primarily for making extra large images, but also has a Region Prompt Control feature.
Like before, we specify different regions of the image, but this time you're limited to simple rectangular selections. The good news is supports LORAs and you can also add negative prompts for each section:



Since we can now use LORA, we don't have to be so detailed with our prompt either. The downside to this method is because of how it draws separate images for each section and tries to blend them together, you can end up with very obvious seams (I also swapped the position of the characters and adjusted the pose at this point):



It took many many rolls and adjusting the tile settings to get a picture that was acceptable.

Final high-res pictures. First was made with Latent Couple (no LORA) plus Hires Fix in txt2img, and the second in Tiled Diffusion in txt2img then again in img2img with upscale. Funnily, the one without LORA turned out more accurate in this case. Much quicker to get the initial low res image too. I think I'm leaning more towards this method.



Again, those plugins are:
Latent Couple
Composable LORA (currenntly broken)
MultiDiffusion / Tiled Diffusion
ControlNet
OpenPose Editor
All can be installed from the extensions tab.

Now there is yet another alternative plugin called Regional Prompter. It says LORAs can be separated "to some extent", so I guess I'll try that next and see how it is.
For now though, I should get to bed..
 
Last edited:

Watamate

Previously known as Tatsunoko
Early Adopter
Joined:  Oct 8, 2022
So I've been screwing around with SoftVC VITS Singing Voice Conversion, the thing that people use for having Pekora or whoever sing random songs. It's not amazing or anything but it's pretty fun to screw around with.


This huggingface has a few Hololive models you can use.


View attachment Gura - Wannabe.mp3

I spent a little bit more time on this. Also pretty rough, but as it goes on it gets a bit better.

View attachment Gura Korone - Hurt Feelings.mp3
 

Watamate

Previously known as Tatsunoko
Early Adopter
Joined:  Oct 8, 2022
Double posting again, since it's not related to my prior post.

I found out Agnai has 11labs support. And it was only 1$ for a month of their cheapest subscription so I figured I'd try it out. The first member I tried didn't sound anything like them (Mumei), Kronii luckily worked a lot better. But then I also had to create an entirely new card/scenario.

After all that I was able to get it to this point:

View attachment eMhle2DR2E.mp4
View attachment KWEmhkFfn9.mp4
 
Last edited:

PleaseCheckYourReceipts

Well-known member
Joined:  May 6, 2023


So, this hit my feed. It's more AI training for voice conversion models. Guy apparently also has videos on setting this stuff up, as well.
 

Clem the Gem

Unknown member
Early Adopter
Joined:  Sep 10, 2022


So, this hit my feed. It's more AI training for voice conversion models. Guy apparently also has videos on setting this stuff up, as well.

Damn, that Lui clip at the start was pretty spot-on.
 
Top Bottom