Ah, that must be it, sorry. I thought they had decorrelated phone numbers and IDs.
Groups have an encryption key that I guess you receive from other members upon joining.
Spaces is an underused feature that I hope will gain more traction! It makes Matrix a credible competitor to Slack and Discord.
Not really, I have used it like that for years. But you need to set it up initially on your phone. The newish feature (less than a year old) is that, I think, they no longer require a phone number to set up a new account.
That’s really interesting! It shows which communities share users. I am part of jlai.lu, a French-speaking community that is relatively isolated, but also slrpnk.net, which seems very spread out!
Would it make sense to compute the standard deviation of each instance’s communities? It would give an idea of which are islands and which are more spread out. Not sure whether it makes more sense to compute it on the 2 projected dimensions or on the original 21934, though.
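To sketch the idea: with each community mapped to a point (in the 2-D projection or the original high-dimensional space), a single scalar spread per instance could be the mean distance of its communities from their centroid. The data below is entirely made up for illustration; the instance names are just the ones mentioned above.

```python
import numpy as np

# Hypothetical 2-D embedding: each row is one community's (x, y) position,
# grouped by home instance. Real data would come from the visualization.
embedding = {
    "jlai.lu":    np.array([[0.10, 0.20], [0.15, 0.25], [0.12, 0.22]]),
    "slrpnk.net": np.array([[-1.0, 2.0], [3.0, -0.5], [0.5, 4.0]]),
}

def spread(points: np.ndarray) -> float:
    """Mean distance of an instance's communities from their centroid.

    Works unchanged in any number of dimensions (2 or 21934), so the
    same measure applies before or after dimensionality reduction.
    """
    centroid = points.mean(axis=0)
    return float(np.linalg.norm(points - centroid, axis=1).mean())

for instance, pts in embedding.items():
    print(f"{instance}: {spread(pts):.3f}")
```

A tightly clustered instance (an "island") gets a small value, a spread-out one a large value; ranking instances by this number would directly answer the island-vs-extended question.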
It would probably be more effective to put an explicit mention in the prompt: “Your interlocutor is a <gendered term> and will be greatly offended to be referred to as a boy or a man.”
I’d have a slightly different take: managing things in-house is going to be cheaper if you have a competent team to do it. The cloud became crucial infrastructure because competent IT and sysadmin people are hard to come by; it is a supply-driven market now. IT staff could help a company save money on AWS hosting, but they can also be put on more crucial and profitable endeavours, and that is what is happening.
I see it at the 2 organizations I work at. One is a startup that does have a single, overworked “hardware guy” who sets up the company’s critical infra. His highest priority is maintaining the machine with private information that we want to host internally for strategic reasons. We calculated that having him install a few machines to host our dev team’s data was the cheapest option, but after 3 months of waiting we opted for a more expensive, but immediately available, cloud option. We could have hired a second person, but our HR department is already having a hard time finding candidates for our crucial positions.
At the non-profits I work with, there is a strong open-source/open-hardware spirit. Yet I am basically the only IT guy there. I often joke that they should ditch their Microsoft Office and Google based tools, and I could help them do it, but I prefer to work on the actual open hardware research projects they are funding. And I think I am right in my priorities.
So yes, the cloud is overpriced, but it is a convenience. Know what you pay for, know that you could save money there, and at some point it may be reasonable to do so. In the end it’s a resource allocation problem: human time vs money.
The Huggingface page has examples of how to use it: https://huggingface.co/ibm-granite/granite-8b-code-instruct
My point is that using “grokking” in ML is not a Musk/Twitter/Whatever-his-Ai-company-is-named invention, it predates their use.
Yes, the original researchers reused a pre-existing meaning, which had been around on the internet for a while before that. I did not know it came from Heinlein and I did not know its full meaning. I remember first seeing it, more than a decade ago, in a text that stated, without any explanation, that an isolated unknown word can easily be grokked from context — demonstrating it immediately. To me (and I guess to those researchers) “grok” means “to understand from context”, which is particularly appropriate here.
BTW Elon was not the only one to reuse this word. Another company named Groq, totally unrelated to Musk as far as I know, designs AI acceleration chips.
Grokking is actually a concept in ML: when a model’s loss suddenly starts to drop long after the model is considered to have overfit. That notion was named by researchers — I’ll let people decide if it is aptly named — but Elon likely just took it from there.
I really want this lemmy community to grow and thrive, but this seemed too important not to post on the biggest community out there, so I made a post on /r/localllama to incite a collective response. Feel free to collaborate or cross-post/copy the message here: https://old.reddit.com/r/LocalLLaMA/comments/1b7iwxi/we_should_make_a_collective_rlocallama_answer_for/
I read the questions asked there and it is clear that they come from people who have done their homework and are already positive about open models. Answering their questions in enough depth is pretty involved and would probably take me 1-2 days to bring up citations and articles.
It could be interesting to make a collaborative answer.
As a non-US citizen, can/should I comment?
I don’t understand how we are supposed to file a comment?
Someone asked something similar on reddit a few days ago: https://old.reddit.com/r/Anarchism/comments/1amqzc4/foss_for_selforganizing_groups/
Note that he did not confirm it was mistral-medium. He says it’s a retrained llama2-70B model, but hints that it is not the fully trained one. Sounds a bit like damage control, but it is not a 100% confirmation of the claim.
Nice! It feels like a direct answer to Karpathy’s comment on Mistral, where he said it is nice to call it “open weight” but not “open source”, because we still don’t know the dataset and the training code. LLM360 seems to be fully open source by that definition, and even releases the checkpoints!
Performance-wise it lags a bit (below a Llama2 of the same size), but all the tools are there to improve it!
Good to know, thanks!