{"id":55185,"date":"2026-05-01T15:46:05","date_gmt":"2026-05-01T14:46:05","guid":{"rendered":"https:\/\/edoardoguzzi.com\/"},"modified":"2026-05-01T15:46:49","modified_gmt":"2026-05-01T14:46:49","slug":"agent-skills-of-anthropic-inside-n8n-how-you-really-do-it","status":"publish","type":"post","link":"https:\/\/edoardoguzzi.com\/en\/agent-skills-of-anthropic-inside-n8n-how-you-really-do-it\/","title":{"rendered":"Anthropic's Agent Skills inside n8n: how it's really done"},"content":{"rendered":"<p>It happens one Thursday. The agent has been responding badly for a week, and when you finally open the system prompt you understand why: eight thousand tokens, all in one block. Always in context. Even when the user is just asking to translate an email.<\/p>\n\n\n\n<p>It's called prompt bloat. And it is the quickest way to make an AI agent stupid and expensive at the same time.<\/p>\n\n\n\n<p>Over the past few months I've been working on the same internal assistant for WebWakeUp several times, and each time I've ended up in the same wall: a single agent that has to do too many different things (write meta descriptions in the style of client X, generate PDFs on templates, do competitor research, answer support tickets), and a system prompt that grows with each new skill added. The model starts to confuse the rules, responses fall flat, costs go up. Classic.<\/p>\n\n\n\n<p>Then I skimmed Anthropic's documentation on the <em>Agent Skills<\/em>, I kept the parts that were needed and tried to bring the pattern inside n8n self-hosted. It works. It works well. But with some distinctions that I'm not finding written out there, and so we're talking about it here.<\/p>\n\n\n\n<p>What follows is a technical reading, a work-in-progress version. I am still testing, so take it as an open construction site, not a finished manual. 
If you see things that don't add up, write to me: the piece is designed for exactly that.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The problem, in three lines<\/h2>\n\n\n\n<p>An AI agent has a cost per turn proportional to how big its system prompt is. If you stuff the instructions for every possible task into the system prompt, you pay that price even when the user is asking for the time. If you keep them out instead, the agent doesn't know how to do anything specific. It's a problem as old as the first hand-crafted prompts, and it's the one Anthropic tackled first, cleanly.<\/p>\n\n\n\n<p>The answer they built is called <em>progressive disclosure<\/em>, which can be translated as \u201clayered context loading.\u201d The agent sees only a tiny manifest at every turn, and loads the rest only when it's really needed.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Three levels that make all the difference<\/h2>\n\n\n\n<p>The Anthropic pattern divides context loading into three layers. Keeping them separate is key to the whole story.<\/p>\n\n\n\n<p><strong>Level one, always loaded.<\/strong> Just the name and description of each skill, one line each. Thirty to fifty tokens per skill. It costs little, lives in the system prompt, and lets the LLM do recognition (figure out whether the user's request matches any of the available skills).<\/p>\n\n\n\n<p><strong>Level two, loaded on match.<\/strong> When the LLM recognizes a relevant skill, it calls a tool that returns the complete instructions for that skill. Five hundred, a thousand, two thousand tokens: it depends on the skill. 
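<\/p>\n\n\n\n<p>The arithmetic behind the pattern is worth making explicit. A back-of-the-envelope sketch; every constant here is an invented assumption, not a measurement:<\/p>\n\n\n\n

```javascript
// Illustrative only: compare a monolithic system prompt against the
// three-level layout over many turns. All numbers are assumptions.
const SKILLS = 50;
const FULL_SKILL_TOKENS = 1500; // level-2 body, loaded on match
const INDEX_LINE_TOKENS = 40;   // level-1 name + description line
const TURNS = 100;
const SKILL_TURNS = 20;         // turns that actually need a skill

// Monolithic: every skill body rides along on every turn.
const monolithic = SKILLS * FULL_SKILL_TOKENS * TURNS; // 7,500,000 tokens

// Progressive disclosure: tiny index every turn, one body on demand.
const layered =
  SKILLS * INDEX_LINE_TOKENS * TURNS + // level 1, always in context
  FULL_SKILL_TOKENS * SKILL_TURNS;     // level 2, only when matched

console.log({ monolithic, layered, ratio: monolithic / layered });
```

\n\n\n\n<p>With these made-up numbers the layered layout moves about one thirtieth of the tokens. Tune the constants to your own catalog; the shape of the saving is the point, not the figures.<\/p>\n\n\n\n<p>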
They enter the context only for that turn (and for subsequent turns as long as they're needed); they are not reloaded with each request.<\/p>\n\n\n\n<p><strong>Level three, loaded on demand by the skill itself.<\/strong> A skill can, within its instructions, say \u201cif the situation is X, read the example file as well.\u201d The agent calls a second tool, pulls in the additional references, and keeps them in context only as long as it needs them. Long examples, edge cases, and reference tables live here.<\/p>\n\n\n\n<p>The result is that the system prompt stays small forever, even with fifty or a hundred skills in the catalog. You pay for the extra tokens only when the user makes a request that justifies them. No waste, no runaway costs.<\/p>\n\n\n\n<p>Anthropic obviously ships this in its own products (Claude Code, its Skills, the Agent SDK). The question is: can you replicate the same pattern inside n8n, without inventing anything? The answer is yes, and almost everything is done with native nodes.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Replicate it inside n8n, with native nodes<\/h2>\n\n\n\n<p>The whole thing assembles from four pieces: AI Agent (version 3.1; from 1.82.0 onward it is the Tools Agent by default, and that's fine), Call n8n Workflow Tool (version 2.2), Execute Workflow Trigger (version 1.1), Postgres (version 2.6). Plus a small Code node to build the manifest. That's it.<\/p>\n\n\n\n<p>The idea is simple. My \u201cskill files\u201d do not live as <code>.md<\/code> on the filesystem (as Anthropic does), but as rows in a Postgres table. Two tables, actually: one for skills, one for their additional references. The MAIN_AGENT, at the start of each session, runs a <code>SELECT<\/code> of all active skills and builds the manifest on the spot. The sub-workflows exposed as tools let the model ask \u201cgive me skill X\u201d or \u201cgive me reference Y of skill X\u201d when it needs them.<\/p>\n\n\n\n<p>No custom code, no external plugins, no extra APIs to maintain. 
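<\/p>\n\n\n\n<p>Stripped of n8n, the moving parts reduce to three tiny operations. A toy in-memory version, with table contents invented for illustration:<\/p>\n\n\n\n

```javascript
// Toy stand-in for the two Postgres tables; contents are invented.
const skills = new Map([
  ["wwu-meta-description", {
    description: "Write meta descriptions in the WWU client style",
    content: "Full instructions for the skill go here...",
    active: true,
  }],
]);
const references = new Map([
  ["wwu-meta-description/examples.md", "Long examples live here..."],
]);

// Level 1: name + description only, built once per session.
const buildManifest = () =>
  [...skills]
    .filter(([, s]) => s.active)
    .map(([name, s]) => ({ name, description: s.description }));

// Level 2: full instructions, fetched on match.
const loadSkill = (name) =>
  skills.get(name)?.active
    ? { skill_name: name, content: skills.get(name).content }
    : { error: `unknown or inactive skill: ${name}` };

// Level 3: extra references, fetched on demand by the skill itself.
const loadReference = (skillName, refName) =>
  references.get(`${skillName}/${refName}`) ?? null;

console.log(buildManifest());
console.log(loadSkill("wwu-meta-description"));
```

\n\n\n\n<p>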
Just nodes.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Postgres yes. Data Tables no (and I'll say why)<\/h2>\n\n\n\n<p>I'm staking out an opinion here, and keeping the door open: this is how I read it, you tell me.<\/p>\n\n\n\n<p>n8n has had <em>Data Tables<\/em> built in for a couple of versions now. Native, interface-managed tables with an exposed REST API endpoint (<code>\/datatables<\/code>) to interact with them from outside too, native CSV import\/export, scoped access by project. For prototyping a system like this they're perfect: you stand up a workflow in half an hour, rows are edited by clicking, and if you need an external panel the API is there.<\/p>\n\n\n\n<p>But to build a skill system on top of them that is expected to hold up for months or years and to grow, there are documented limitations that must be weighed. Nothing dramatic, no \u201cclosed box,\u201d but definite limits that change the choice.<\/p>\n\n\n\n<p><strong>Column types<\/strong>: Boolean, Date, Number, String. No JSON, no VECTOR, no BLOB. So no embeddings on the same table, no arbitrary structured payload.<\/p>\n\n\n\n<p><strong>Filters<\/strong>: Equals, Not Equals, Greater Than and Less Than (with or without equal), Is Empty, Is Not Empty. No LIKE, no full-text, no similarity. For skill matching you retrieve the whole manifest and that's that: no server-side pre-filtering.<\/p>\n\n\n\n<p><strong>Operations<\/strong>: Insert, Update, Upsert, Delete, Get on rows. Create, Delete, List, Update on the tables. No raw SQL.<\/p>\n\n\n\n<p><strong>Storage cap<\/strong>: default 50MB for all Data Tables in an instance. On self-hosted it can be raised via the environment variable <code>N8N_DATA_TABLES_MAX_SIZE_BYTES<\/code>. n8n alerts at 80% and blocks inserts and updates at 100%. 
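<\/p>\n\n\n\n<p>Back-of-the-envelope on that cap; the 5KB average skill body is an assumption, and per-row overhead is ignored:<\/p>\n\n\n\n

```javascript
// Arithmetic only: how far the default Data Tables cap stretches.
const CAP_BYTES = 50 * 1024 * 1024; // default 50MB instance-wide cap
const AVG_SKILL_BYTES = 5 * 1024;   // assumed ~5KB per skill body
const capacity = Math.floor(CAP_BYTES / AVG_SKILL_BYTES);
console.log(capacity); // 10240 skills before hitting the ceiling
```

\n\n\n\n<p>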
For a catalog of textual skills it may suffice for a long time, but it is a ceiling to watch.<\/p>\n\n\n\n<p><strong>No access from Code node<\/strong>: verbatim quote from the docs, <em>\u201cDirect programmatic access to data tables from a Code node isn't supported.\u201d<\/em> For the architecture described above, where a Code node packs the manifest, this is a direct problem. It's workable (a Set node after a Get Many instead of Code), but it changes the shape.<\/p>\n\n\n\n<p>On three things the docs are silent, so I make no claims: native versioning, behavior under heavy concurrency, and whether n8n backups include the contents of Data Tables. I don't know, and I'm going to check. If you have precise references on these three points, write to me.<\/p>\n\n\n\n<p>Summing up, I'm taking Postgres anyway, for five reasons, all of them concrete.<\/p>\n\n\n\n<p><strong>First<\/strong>, I want the 50MB cap out of the picture. I don't want a skill registry that could grow with versioning and history to depend on a threshold I have to monitor.<\/p>\n\n\n\n<p><strong>Second<\/strong>, I want server-side similarity search the day the skills number a hundred or more. pgvector is ready; for Data Tables there is no VECTOR type on the horizon.<\/p>\n\n\n\n<p><strong>Third<\/strong>, I want the Code node free to read and package data. I need it today for the AI Agent pipeline and I will need it tomorrow for export and backup pipelines.<\/p>\n\n\n\n<p><strong>Fourth<\/strong>, I want raw SQL for the day I need an aggregation, a JOIN, a history trigger. The history trigger in Postgres is a ten-row function; on Data Tables it would have to be built by hand as a workflow.<\/p>\n\n\n\n<p><strong>Fifth<\/strong>, I am fine with an extra container. 
My n8n instance already lives in docker compose; adding a Postgres with the <code>pgvector\/pgvector:pg16<\/code> image costs nothing in operational friction.<\/p>\n\n\n\n<p>If the goal were a small catalog, ten or twenty fixed skills with no pretensions to growth, Data Tables would be a perfectly defensible choice. In my case, with a growth horizon I don't yet know but don't want to cap, I took the extra container.<\/p>\n\n\n\n<p>Maybe I'm wrong. If someone has run a skill system on Data Tables at real scale (one hundred, two hundred entries, with history and external admin via the <code>\/datatables<\/code> API), I would love to hear about it: the debate is open.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The Postgres schema, with pgvector from day one<\/h2>\n\n\n\n<p>Dedicated container, don't reuse n8n's. I want to be able to do separate backups, I don't want to pollute n8n's operational database with my application data, and I want to be able to migrate the whole thing somewhere else someday without having to disassemble n8n. It's cleaner that way.<\/p>\n\n\n\n<p>An important point I confirmed later: the right image is <code>pgvector\/pgvector:pg16<\/code>, not <code>postgres:16-alpine<\/code>. It is standard Postgres plus the preinstalled pgvector extension. Without the extension enabled, it behaves like vanilla Postgres, zero overhead. 
However, the day you want to enable similarity search for the jump to manifest-via-RAG, you're already there: no image migration, no cold <code>pg_dump<\/code>, no time wasted.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>services:\n  wwu-skills-db:\n    image: pgvector\/pgvector:pg16\n    container_name: wwu-skills-db\n    restart: unless-stopped\n    environment:\n      POSTGRES_DB: wwu_skills\n      POSTGRES_USER: wwu_skills_user\n      POSTGRES_PASSWORD: ${WWU_SKILLS_DB_PASSWORD}\n    volumes:\n      - wwu_skills_data:\/var\/lib\/postgresql\/data\n    networks:\n      - n8n_network\nvolumes:\n  wwu_skills_data:\nnetworks:\n  n8n_network:\n    external: true<\/code><\/pre>\n\n\n\n<p>No exposed port on the host. Only n8n talks to this Postgres, on the internal docker network. The password lives in an environment variable, not in the file.<\/p>\n\n\n\n<p>The actual schema. Three tables: <code>skills<\/code>, <code>skill_references<\/code>, <code>skills_history<\/code>. Plus a trigger that archives the previous version at each <code>UPDATE<\/code>, so you get versioning for free.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>CREATE SCHEMA IF NOT EXISTS skills_mgmt;\n \n-- Day 1: enable the extension. Zero cost if not used.\nCREATE EXTENSION IF NOT EXISTS vector;\n \nCREATE TABLE skills_mgmt.skills (\n    name VARCHAR(80) PRIMARY KEY,\n    description TEXT NOT NULL,\n    content TEXT NOT NULL,\n    active BOOLEAN NOT NULL DEFAULT TRUE,\n    version INTEGER NOT NULL DEFAULT 1,\n    -- Embedding column, nullable for now. 
Dimension matches your embedding\n    -- model: 1536 for text-embedding-3-small, 3072 for text-embedding-3-large.\n    description_embedding vector(1536),\n    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),\n    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),\n    CHECK (name ~ '^[a-z0-9-]+$')\n);\n \nCREATE TABLE skills_mgmt.skill_references (\n    id SERIAL PRIMARY KEY,\n    skill_name VARCHAR(80) NOT NULL\n                    REFERENCES skills_mgmt.skills(name) ON DELETE CASCADE,\n    reference_name VARCHAR(120) NOT NULL,\n    content TEXT NOT NULL,\n    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),\n    UNIQUE (skill_name, reference_name),\n    CHECK (reference_name ~ '^[a-z0-9_.-]+$')\n);\n \nCREATE TABLE skills_mgmt.skills_history (\n    id BIGSERIAL PRIMARY KEY,\n    name VARCHAR(80) NOT NULL,\n    description TEXT,\n    content TEXT,\n    version INTEGER,\n    archived_at TIMESTAMPTZ NOT NULL DEFAULT NOW()\n);\n \nCREATE OR REPLACE FUNCTION skills_mgmt.fn_skills_history()\nRETURNS TRIGGER AS $$\nBEGIN\n    INSERT INTO skills_mgmt.skills_history(name, description, content, version)\n    VALUES (OLD.name, OLD.description, OLD.content, OLD.version);\n    NEW.version := OLD.version + 1;\n    NEW.updated_at := NOW();\n    RETURN NEW;\nEND;\n$$ LANGUAGE plpgsql;\n \nCREATE TRIGGER trg_skills_history\nBEFORE UPDATE ON skills_mgmt.skills\nFOR EACH ROW EXECUTE FUNCTION skills_mgmt.fn_skills_history();\n \nCREATE INDEX idx_skills_active\n    ON skills_mgmt.skills(active) WHERE active = TRUE;<\/code><\/pre>\n\n\n\n<p>The <code>CHECK<\/code> constraints on names are not strictly necessary, because the queries are parameterized by n8n's Postgres node and injection is already covered. But they're belt and suspenders: if someone bypasses the workflows and inserts by hand, at least the database does not let strange identifiers through.<\/p>\n\n\n\n<p>The HNSW index for similarity search is created only once you have actually populated the <code>description_embedding<\/code> column. 
Creating it empty on a NULL column does no good, and it takes up memory.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The three workflows<\/h2>\n\n\n\n<p>Three separate workflows. One main one, and two very small ones that just act as tools.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Workflow A: <code>tool_load_skill<\/code><\/h3>\n\n\n\n<p>Three nodes and that's it. An <em>Execute Workflow Trigger<\/em> with <code>inputSource: workflowInputs<\/code> and a single declared input <code>{ name: \"skill_name\", type: \"string\" }<\/code>. A <em>Postgres<\/em> node in <code>executeQuery<\/code> mode with the query below. A final <em>Set<\/em> node that normalizes the output, returning either <code>{ skill_name, content }<\/code> or <code>{ error }<\/code>.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT name, description, content\nFROM skills_mgmt.skills\nWHERE name = $1 AND active = TRUE\nLIMIT 1;<\/code><\/pre>\n\n\n\n<p>The parameter <code>$1<\/code> is populated with <code>options.queryReplacement<\/code> set to <code>={{ $json.skill_name }}<\/code>. No string concatenation in the query, no injection possible: n8n sanitizes the value before passing it to the driver. Documented, checked, forgotten.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Workflow B: <code>tool_load_reference<\/code><\/h3>\n\n\n\n<p>Identical to the previous one, but with two inputs instead of one: <code>skill_name<\/code> and <code>reference_name<\/code>. The query takes two parameters, with <code>queryReplacement<\/code> set to <code>={{ $json.skill_name }},={{ $json.reference_name }}<\/code>.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT content\nFROM skills_mgmt.skill_references\nWHERE skill_name = $1 AND reference_name = $2\nLIMIT 1;<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Workflow C: the <code>MAIN_AGENT<\/code><\/h3>\n\n\n\n<p>The real workflow, the one that the client or internal system hits. 
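<\/p>\n\n\n\n<p>One aside before the node list: the normalization step at the end of workflows A and B is trivial, but worth pinning down, because the LLM reads whatever comes back. A sketch of the equivalent logic; the error wording is my own:<\/p>\n\n\n\n

```javascript
// Equivalent of the final Set node in tool_load_skill: either the row
// came back from Postgres, or the LLM gets a structured error it can
// reason about instead of an empty payload.
function normalizeSkillResult(rows, requestedName) {
  if (!rows || rows.length === 0) {
    return { error: `skill not found or inactive: ${requestedName}` };
  }
  const { name, content } = rows[0];
  return { skill_name: name, content };
}

console.log(normalizeSkillResult([], "wwu-unknown"));
```

\n\n\n\n<p>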
Four nodes in a row plus the AI Agent sub-nodes.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Chat Trigger<\/strong> (version 1.4): this is the front door. A Webhook is also fine if you prefer.<\/li>\n\n\n\n<li><strong>Postgres select<\/strong> on the table <code>skills_mgmt.skills<\/code> with <code>where: active = true<\/code>, <code>outputColumns: name, description<\/code>, <code>returnAll: true<\/code>. This loads the manifest.<\/li>\n\n\n\n<li><strong>Code node<\/strong> to package the manifest into a clean JSON string.<\/li>\n\n\n\n<li><strong>AI Agent<\/strong> (version 3.1) with the system message injecting the manifest via expression, and the two Tool Workflows hooked up as tools.<\/li>\n<\/ul>\n\n\n\n<p>The Code node is the only piece of code in the whole system. Nothing esoteric:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/**\n * Build a compact JSON index for the system prompt.\n * Keep description short, every char costs tokens at every turn.\n *\/\nconst rows = $input.all().map(i =&gt; i.json);\nconst index = rows.map(r =&gt; ({\n  name: r.name,\n  description: r.description\n}));\nreturn [{ json: { skills_index: JSON.stringify(index, null, 2) } }];<\/code><\/pre>\n\n\n\n<p>The AI Agent's system message. Yes, it accepts n8n expressions: I confirmed it directly on the TypeScript type of the node, <code>systemMessage: string | Expression | PlaceholderValue<\/code>. Thus:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>You are the WWU AI Agent.\n \nBEFORE performing any non-trivial task, scan SKILLS_INDEX below for a matching skill.\nIf a skill matches the user request, you MUST call the `load_skill` tool with its\nexact name BEFORE producing any output. Never invent skill names. 
Never produce a\ndeliverable that should use a skill without first loading it.\n \nIf a loaded skill instructs you to read a reference file, call `load_reference`\nonly when the current task actually needs that level of detail.\n \nSKILLS_INDEX:\n{{ $('Build Index').item.json.skills_index }}<\/code><\/pre>\n\n\n\n<p>The two Tool Workflows need two things configured with particular care. First, the <em>description<\/em> of the tool, because that is what the LLM reads to decide whether to call it. Spend five minutes writing it well; it saves you hours of wrong routing later.<\/p>\n\n\n\n<p>Second, the input parameters of the sub-workflow should be populated by clicking the \u201cAI\u201d button on the field, so that the model fills in the value via <code>$fromAI()<\/code>. Never hardcode them, never put static expressions on them, or the tool goes blind and the LLM runs into a wall.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The flow of a request, end-to-end<\/h2>\n\n\n\n<p>The user writes, \u201cwrite me the meta description for the email marketing page.\u201d The turn goes like this, told in the two phases that make it up.<\/p>\n\n\n\n<p><strong>Phase one, context preparation.<\/strong> The Chat Trigger receives the message. The Postgres node pulls out the active skill rows (name and description, no content), the Code node packages them into a JSON string of a few KB, and the AI Agent starts with the system message already complete with the manifest. All of this is the same for every request, and it costs very little.<\/p>\n\n\n\n<p><strong>Phase two, skill activation.<\/strong> The model reads the manifest, recognizes <code>wwu-meta-description<\/code>, and calls <code>load_skill<\/code> passing the exact name. The sub-workflow executes the parameterized query and returns the full content of the skill. 
The model reads it, decides whether it needs the additional references (and if so, calls <code>load_reference<\/code>), then writes the meta description following the rules it just loaded.<\/p>\n\n\n\n<p>The point, in short.<\/p>\n\n\n\n<p>The system prompt stays the same size whether there are ten skills or two hundred. The cost per turn changes little. The cost per request goes up only in proportion to how many skills are actually loaded, and that is usually one. Maybe two. Never all of them at once.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Sharp edges I've caught (so far)<\/h2>\n\n\n\n<p>The honest part of the piece. I'm still testing, and there are things I'd like a thousand more opinions on.<\/p>\n\n\n\n<p><strong>Sub-nodes, expressions, and the first item.<\/strong> The n8n docs say this explicitly: when an expression in a sub-node receives an array of items, it resolves only to the first one. If for some reason you pass multiple items to the Tool Workflow, the LLM sees only the first. The lesson: keep the tools single-purpose, single-input, single-output. No batching.<\/p>\n\n\n\n<p><strong>Memory polluting itself.<\/strong> If you use a long memory buffer, tool responses end up in the conversation history and are sent back to the model on each subsequent turn. The contents of a skill loaded half an hour ago are still there. It's a known effect of the pattern, not a bug, but if you don't know about it, it catches you off guard. Solutions I'm trying: a short window memory, or a summary memory. I have yet to decide which one holds up better in production.<\/p>\n\n\n\n<p><strong>The quality of the description is worth more than the quality of the content.<\/strong> The model chooses which skill to load by looking at the description, not the content. A skill written well but with a vague description is ignored. A skill written so-so but with a clear description is always called. 
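<\/p>\n\n\n\n<p>A cheap guard I find useful against manifest bloat; the helper is hypothetical, and the four-characters-per-token rule is a crude heuristic, not a tokenizer:<\/p>\n\n\n\n

```javascript
// Rough guard: flag skill descriptions that would bloat the
// always-loaded manifest. ~4 chars per token is a crude heuristic.
const approxTokens = (text) => Math.ceil(text.length / 4);

function checkDescriptions(rows, maxTokens = 50) {
  return rows
    .filter((r) => approxTokens(r.description) > maxTokens)
    .map((r) => `${r.name}: ~${approxTokens(r.description)} tokens`);
}

const warnings = checkDescriptions([
  { name: "wwu-meta-description",
    description: "Write meta descriptions for client pages in the agreed style." },
  { name: "wwu-report", description: "x".repeat(400) }, // deliberately too long
]);
console.log(warnings); // only wwu-report gets flagged
```

\n\n\n\n<p>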
Spend time there, not elsewhere.<\/p>\n\n\n\n<p><strong>System message and payload size limits.<\/strong> n8n does not document hard limits. In practice the limit comes from the underlying model. With two hundred skills of one description line each, you're at roughly sixteen KB of manifest, still manageable. Above five hundred, you'd better jump to the <code>search_skills<\/code> pattern with embeddings, which is why pgvector should go in from day one even if you don't use it right away.<\/p>\n\n\n\n<p><strong>Hot reload of skills.<\/strong> The manifest is loaded at the beginning of the session, not every turn. If you edit a skill while a conversation is already open, the LLM continues to see the old version until the end of the session. That's fine for my use case, but if you're designing a system where operators hot-edit skills throughout the day, keep that in mind.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What to do concretely if you want to try it<\/h2>\n\n\n\n<p>Five steps, in order.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Add the <code>pgvector\/pgvector:pg16<\/code> container to docker-compose next to n8n, on the same docker network, without exposing ports to the host.<\/li>\n\n\n\n<li>Create Postgres credentials in n8n pointing at <code>wwu-skills-db<\/code>, schema <code>skills_mgmt<\/code>, and apply the SQL schema above.<\/li>\n\n\n\n<li>Create the sub-workflow <code>tool_load_skill<\/code>: three nodes, an exact copy. Ditto for <code>tool_load_reference<\/code>.<\/li>\n\n\n\n<li>Create the main workflow: Chat Trigger, Postgres select, Code node for the manifest, AI Agent with the two Workflow tools hooked up. On the system message, put the expression that injects <code>skills_index<\/code>. On the tool parameters, the AI button.<\/li>\n\n\n\n<li>Insert two or three test skills in the table, with clearly differentiated descriptions. Send requests to the chatbot that should trigger only one at a time. 
Check the execution logs to see whether the tools are called in the right order and whether the content comes back complete.<\/li>\n<\/ol>\n\n\n\n<p>Step five is half the real work. The first runs will show the cracks in the prompt: wrong routing, tools not called, skills loaded when not needed. This is normal. You iterate on the descriptions, iterate on the system message, and polish.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Where this is going<\/h2>\n\n\n\n<p>The natural leap, when the skills run into the hundreds, is to remove the flat manifest from the system prompt and replace it with a third tool: <code>search_skills(query, top_k)<\/code>. Internally it does this: it takes the user request, computes an embedding via the provider you prefer (OpenAI, Voyage, Cohere; there is an Embeddings node inside n8n), runs a cosine-similarity query (<code>vector_cosine_ops<\/code>) on the skills table, and pulls out the three most similar names. That's it. The model sees only those three, loads one, and works.<\/p>\n\n\n\n<p>The pgvector extension is already there thanks to the image, and the <code>description_embedding<\/code> column is already there, nullable; you just need to populate it (a one-time workflow, or a trigger on <code>INSERT<\/code> and <code>UPDATE<\/code> calling the embedding provider) and create the HNSW index. Zero migration, zero downtime. 
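<\/p>\n\n\n\n<p>The heart of that future <code>search_skills<\/code> tool can be sketched without pgvector at all; the SQL version just does the same ranking server-side. Toy three-dimensional vectors and invented names below:<\/p>\n\n\n\n

```javascript
// Toy cosine-similarity top-k: the in-memory equivalent of what a
// pgvector vector_cosine_ops query does server-side.
const dot = (a, b) => a.reduce((sum, x, i) => sum + x * b[i], 0);
const norm = (a) => Math.sqrt(dot(a, a));
const cosine = (a, b) => dot(a, b) / (norm(a) * norm(b));

function searchSkills(queryEmbedding, catalog, topK = 3) {
  return catalog
    .map((s) => ({ name: s.name, score: cosine(queryEmbedding, s.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Toy 3-d embeddings; real ones would be 1536-d from the provider.
const catalog = [
  { name: "wwu-meta-description", embedding: [0.9, 0.1, 0.0] },
  { name: "wwu-pdf-report",       embedding: [0.0, 1.0, 0.1] },
  { name: "wwu-competitor-scan",  embedding: [0.1, 0.2, 0.9] },
];
console.log(searchSkills([1, 0, 0], catalog, 2));
```

\n\n\n\n<p>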
This is why I insisted on starting with the pgvector image and not standard postgres.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Sources<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/docs.n8n.io\/integrations\/builtin\/cluster-nodes\/sub-nodes\/n8n-nodes-langchain.toolworkflow\/\" target=\"_blank\" rel=\"noopener\">n8n Docs, Call n8n Workflow Tool node<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/docs.n8n.io\/integrations\/builtin\/cluster-nodes\/root-nodes\/n8n-nodes-langchain.agent\/\" target=\"_blank\" rel=\"noopener\">n8n Docs, AI Agent (Tools Agent)<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/docs.n8n.io\/integrations\/builtin\/app-nodes\/n8n-nodes-base.postgres\/\" target=\"_blank\" rel=\"noopener\">n8n Docs, Postgres node<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/hub.docker.com\/r\/pgvector\/pgvector\" target=\"_blank\" rel=\"noopener\">pgvector\/pgvector official image, Docker Hub<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/pgvector\/pgvector\" target=\"_blank\" rel=\"noopener\">pgvector, GitHub repo<\/a><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Closing<\/h2>\n\n\n\n<p>I am still testing this architecture. Maybe in a month I'll write that I've changed my mind about something, that the memory pattern I chose doesn't hold up under load, or that Data Tables were fine after all and I made it complicated for nothing.<\/p>\n\n\n\n<p>For now, this is my reading. If anyone has tried a different approach, especially at scale (one hundred, two hundred skills or more), or has a strong opinion about the memory piece, please write to me.<\/p>\n\n\n\n<p>Here's a gist with a summary of the technical part in English: <a href=\"https:\/\/gist.github.com\/mredodos\/b71e2ee9a4431bbbe09a0f2ed0539df5\" target=\"_blank\" rel=\"noopener\">https:\/\/gist.github.com\/mredodos\/b71e2ee9a4431bbbe09a0f2ed0539df5<\/a><\/p>","protected":false},"excerpt":{"rendered":"<p>It happens one Thursday. 
The agent has been responding badly for a week, and when you finally open the system prompt you understand why: eight thousand tokens, all in one block. Always in context. Even when the user is just asking to translate an email. It's called prompt bloat. And it is the quickest way to make an AI Agent stupid and expensive at the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":55187,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[391,64,406,388,411],"tags":[386,410,385,390],"class_list":["post-55185","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-automazioni-e-integrazioni","category-blog","category-intelligenza-artificiale","category-n8n","category-no-code-low-code","tag-automazioni","tag-intelligenza-artificiale","tag-n8n","tag-self-hosting"],"meta_box":{"articolo_correlato_di_approfondimento":"","data_pubblicazione_sui_social":"0","pubblicato_sui_social":"","immagine_generata":"true"},"_links":{"self":[{"href":"https:\/\/edoardoguzzi.com\/en\/wp-json\/wp\/v2\/posts\/55185","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/edoardoguzzi.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/edoardoguzzi.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/edoardoguzzi.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/edoardoguzzi.com\/en\/wp-json\/wp\/v2\/comments?post=55185"}],"version-history":[{"count":2,"href":"https:\/\/edoardoguzzi.com\/en\/wp-json\/wp\/v2\/posts\/55185\/revisions"}],"predecessor-version":[{"id":55188,"href":"https:\/\/edoardoguzzi.com\/en\/wp-json\/wp\/v2\/posts\/55185\/revisions\/55188"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/edoardoguzzi.com\/en\/wp-json\/wp\/v2\/media\/55187"}],"wp:attachment":[{"href":"https:\/\/edoardoguzzi.com\/en\/wp-json\/wp\/v2\/media?parent=55185"}],"wp:te
rm":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/edoardoguzzi.com\/en\/wp-json\/wp\/v2\/categories?post=55185"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/edoardoguzzi.com\/en\/wp-json\/wp\/v2\/tags?post=55185"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}